This report reviews research on the effects of
between- and within-class ability grouping on the achievement of 
elementary school students. The review technique, known as 
"best-evidence synthesis," combines features of meta-analytic and 
narrative reviews. Overall, evidence does not support assignment of 
students to self-contained classes according to ability, but grouping 
plans involving cross-grade assignment for selected subjects can
increase student achievement. Research particularly supports the
Joplin Plan, cross-grade ability grouping for reading only, and forms
of nongraded programs involving multiple groupings for different 
subjects. Within-class ability grouping in mathematics is also found 
to be instructionally effective. Ability grouping is held to be 
maximally effective: (1) when it is done only for one or two 
subjects, with students remaining in heterogeneous classes most of 
the day; (2) when it greatly reduces student heterogeneity in a 
specific skill; (3) when group assignments are frequently reassessed; 
and (4) when teachers vary the level and pace of instruction 
according to students' needs. (An 18-page reference list is
appended.) (Author/RK)
The Center



The mission of the Center for Research on Elementary and
Middle Schools is to produce useful knowledge about how
elementary and middle schools can foster growth in students'
learning and development, to develop and evaluate practical
methods for improving the effectiveness of elementary and
middle schools based on existing and new research findings,
and to develop and evaluate specific strategies to help
schools implement effective research-based school and
classroom practices.

The Center conducts its research in three program areas:
(1) Elementary Schools, (2) Middle Schools, and (3) School
Improvement.

The Elementary School Program

This program works from a strong existing research base
to develop, evaluate, and disseminate effective elementary
school and classroom practices; synthesizes current
knowledge; and analyzes survey and descriptive data to expand
the knowledge base in effective elementary education.

The Middle School Program

This program's research links current knowledge about
early adolescence as a stage of human development to school
organization and classroom policies and practices for
effective middle schools. The major task is to establish a
research base to identify specific problem areas and
promising practices in middle schools that will contribute to
effective policy decisions and the development of effective
school and classroom practices.

School Improvement Program

This program focuses on improving the organizational per- 
formance of schools in adopting and adapting innovations and 
developing school capacity for change. 



This report, prepared by the Elementary School Program, 
synthesizes research on ability grouping in elementary 
schools to identify grouping practices that promote student 
achievement. 



Ability Grouping and Student Achievement in Elementary Schools:

A Best-Evidence Synthesis



This article reviews research on the effects of between- and
within-class ability grouping on the achievement of elementary
school students. The review technique, best-evidence synthesis,
combines features of meta-analytic and narrative reviews. Overall,
evidence does not support assignment of students to self-contained
classes according to ability (median ES = .00), but grouping plans
involving cross-grade assignment for selected subjects can increase
student achievement. Research particularly supports the Joplin
Plan, cross-grade ability grouping for reading only (median ES =
+.45), and forms of nongraded programs involving multiple groupings
for different subjects (median ES = +.29). Within-class ability
grouping in mathematics is also found to be instructionally
effective (median ES = +.34). Ability grouping is held to be maximally
effective when it is done only for one or two subjects, with
students remaining in heterogeneous classes most of the day; when it
greatly reduces student heterogeneity in a specific skill; when
group assignments are frequently reassessed; and when teachers vary
the level and pace of instruction according to students' needs.



Ability grouping is one of the oldest and most controversial 
issues in educational psychology. Hundreds of studies have examined 
the effects of various forms of between-class ability grouping 
(e.g., tracking, streaming) and within-class ability grouping (e.g., 
reading, math groups). By 1930, Miller and Otto had already located 
twenty experimental studies on ability grouping, and Martin (1927) 
listed eighty-three "selected references" on the topic. 

Scores of reviews of the between-class ability grouping literature
have been written. Almost without exception, reviews from the
1920's to the present have come to the same general conclusion: that
between-class ability grouping has few if any benefits for student
achievement. Recently, meta-analyses on ability grouping in
elementary (C.-L. Kulik & J. Kulik, 1984) and in secondary schools
(Kulik & Kulik, 1982) have claimed small positive achievement effects
of between-class ability grouping, with high achievers gaining the
most from the practice.

Despite a half-century of widespread agreement (among researchers,
at least) that between-class ability grouping is of little
value in enhancing student achievement, the practice is nearly
universal in some form in secondary schools and very common in
elementary schools. Recent data are lacking, but over time most
teachers have favored ability grouping (e.g., NEA, 1968; Wilson &
Schmits, 1978). Yet, in recent years many districts have begun to
reexamine ability grouping, often out of a concern that students low
in socioeconomic status, in particular minority students, are
discriminated against by being disproportionately placed in low
tracks. In fact, ability grouping has become a major issue in many
ongoing desegregation cases (e.g., Hobson v. Hansen, 1967) where the
plaintiffs have argued that ability grouping is used as a means of
resegregating Black and Hispanic students within ostensibly
integrated schools (see McPartland, 1968).




Although many reviews of ability grouping have been written, the
most recent comprehensive reviews in this area were written more
than sixteen years ago (e.g., Borg, 1965; Findley & Bryan, 1971;
Heathers, 1969; NEA, 1968). More recent reviews (e.g., Esposito,
1973; Good & Marshall, 1984; Persell, 1977) have referred to the
earlier reviews rather than synthesizing the original evidence. The
Kuliks' meta-analyses have extracted effect-size data from large
numbers of primary studies, but have done little beyond this to
explore the substantive and methodological issues underlying these
effects (see Slavin, 1984a).

The present paper reviews the literature on ability grouping in
elementary schools from the vantage point of the 1980's. It uses a
review strategy called "Best-Evidence Synthesis" (Slavin, 1986), a
method which incorporates the best features of both meta-analytic
and traditional narrative reviews. The main elements of a
best-evidence synthesis are as follows:

- Clearly specified, defensible a priori criteria for inclusion
  of studies are established.

- All published and unpublished studies which meet criteria are
  located and included.

- Where possible, effect sizes for included studies are computed.
  Effect size is operationalized as the mean of all
  experimental-control differences on related measures divided
  by their standard deviations.

- When effect sizes cannot be computed, effects of studies
  which meet inclusion criteria are characterized as positive,
  negative, or zero rather than excluded.

- Apart from computation of effect size and use of well-specified
  inclusion criteria, best-evidence syntheses are identical
  to traditional narrative reviews. Individual studies and
  methodological and substantive issues are discussed in the
  detail typical of the best narrative reviews.
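
The quantitative part of these elements can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the tuple layout for a measure, and the use of a single standard deviation per measure are hypothetical choices made for the example, not part of the review's procedure as published.

```python
from statistics import mean, median
from typing import List, Optional, Sequence, Tuple, Union

# One outcome measure: (experimental mean, control mean, standard deviation).
# Which standard deviation a study reports is an assumption of this sketch.
Measure = Tuple[float, float, float]


def study_effect_size(measures: Sequence[Measure]) -> Optional[float]:
    """Mean of the standardized experimental-control differences.

    Returns None when the study reports no usable means and SDs.
    """
    if not measures:
        return None
    return mean((exp - ctl) / sd for exp, ctl, sd in measures)


def summarize_study(measures: Sequence[Measure],
                    reported_direction: str = "0") -> Union[float, str]:
    """Effect size if computable, else the reported direction (+, -, or 0)."""
    es = study_effect_size(measures)
    return es if es is not None else reported_direction


def median_effect_size(results: Sequence[Union[float, str]]) -> Optional[float]:
    """Median over the studies for which an effect size could be computed."""
    sizes: List[float] = [r for r in results if isinstance(r, float)]
    return median(sizes) if sizes else None
```

Studies without a computable effect size stay in the summary as a direction label rather than being dropped, which mirrors the fourth element above.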

The present paper is the first application of best-evidence 
synthesis. Since both meta-analytic and traditional narrative 
reviews exist in the area of ability grouping, this paper allows for 
a clear contrast between the methods and conclusions of best-evi- 
dence syntheses as opposed to meta-analytic and narrative reviews on 
the same topic. 
What is Ability Grouping?

One important problem in discussing "ability grouping" is that
the term has many meanings. Several quite different programs or
policies go under this heading. In general, ability grouping
implies some means of grouping students for instruction by ability
or achievement so as to reduce their heterogeneity. However,
various grouping plans differ in ways likely to have a considerable
impact on the outcomes of grouping. Some common forms of ability
grouping are described below.

Ability Grouped Class Assignment. In this plan, students are
assigned on the basis of ability or achievement to one
self-contained class (usually at the elementary level) or to one
class which moves together from teacher to teacher, as in block
scheduling in junior high schools.

Curriculum Tracking. A special form of ability grouped class
assignment unique to the secondary level is curriculum tracking,
assignment of students by ability or achievement to tracks, such as
college preparatory, general, or vocational. In secondary schools
using such groupings, students may take all courses within their
track, or may have some heterogeneously grouped classes. Typically,
ability grouping within tracks is not done.

Specialized secondary schools (e.g., schools for the gifted,
vocational schools) might be considered one form of curriculum
tracking. In Europe, different levels of secondary schools serve a
similar tracking function. For example, in West Germany students
planning to attend the university go to the Gymnasium, less highly
skilled students attend the Realschule, and students preparing for
vocations attend the Hauptschule.

Regrouping for Reading or Mathematics (Ability Grouping for
Selected Subjects). Often, students are assigned to heterogeneous
homeroom classes for part or most of the day, but are "regrouped"
according to achievement level for one or more subjects. In the
elementary grades, regrouping is often done for reading (and
occasionally mathematics), where all students at a particular grade
level have reading scheduled at the same time and are re-sorted from
their heterogeneous homerooms into classes that are relatively
homogeneous in reading level. When regrouping for reading is done
across grade levels, this is called the "Joplin Plan" (see below).


In secondary schools students are often ability grouped for some 
subjects (e.g., mathematics) but not for others (e.g., social
studies). Ability grouping for selected subjects in secondary schools
may involve having higher- and lower-achieving sections of the same 
course, or may involve assigning students to different courses, as 
when ninth graders are assigned either to Algebra I or to General 
Mathematics. 

Joplin Plan. One special form of regrouping for reading is
called the Joplin Plan (Floyd, 1954), in which students are assigned
to heterogeneous classes most of the day but are regrouped for
reading across grade lines. For example, a reading class at the fifth
grade, first semester reading level might include high-achieving
fourth graders, average-achieving fifth graders, and low-achieving
sixth graders. Reading group assignments are frequently reviewed,
so that students may be reassigned to a different reading class if
their performance warrants it.

One important consequence of cross-grade grouping and flexible
assignment is that reading classes contain only one or at most two
reading groups, increasing the amount of time available for direct
instruction over that typical of reading classes containing three or
more reading groups. The Joplin Plan was principally an innovation
of the late 1950's and early 1960's, after which time interest in
cross-grade grouping turned more toward nongraded plans (see below).

Nongraded Plans. The term "nongraded" or "ungraded" refers to a
variety of related grouping plans. In its original conception
(Goodlad and Anderson, 1963), nongraded programs are ones in which
grade-level designations are entirely removed, and students are
placed in flexible groups according to their performance level, not
their age. Full-scale nongraded plans might use team teaching,
individualized instruction, learning centers, and other means of
accommodating student differences in all academic subjects.
Students in nongraded programs might complete the primary cycle
(grades 1-3) in two years, or may take four years to do so. The
curriculum in each subject may be divided into levels (e.g., nine or
twelve levels for the primary grades) through which students
progress at their own rates, picking up each year where they left
off the previous year. This "continuous progress" aspect of
nongrading gives students a feeling that they are always moving
forward; for example, rather than being assigned to the low reading
group each year, a low achieving student simply progresses from
level to level at a slower rate.

Some of the nongraded programs evaluated in the 1960's and early
'70's did use the flexible, complex grouping arrangements envisioned
by Goodlad and Anderson (1963), while others did not. For example,
several programs described by their authors as "nongraded" or
"ungraded" were in fact virtually indistinguishable from the Joplin
Plan. That is, students were assigned to heterogeneous classes most
of the day but regrouped across grade lines for reading. One study
(Morris, 1969) used "nongraded" to refer to a program in which
students were regrouped for reading and math within grade levels,
while another (Tobin, 1966) used "ungraded" to refer to a
traditionally organized reading program in which high achievers were
allowed to work on basals above their nominal grade level.

Special Classes for High Achievers. In many elementary and
secondary schools, gifted, talented, or otherwise superior students
may be assigned to a special class for part or all of their school
day, while other students remain in relatively heterogeneous classes.

Special Classes for Low Achievers. One of the most common forms
of "ability grouping" is the assignment of students with learning
problems to special or remedial classes for part or all of their
school day.



Within-Class Ability Grouping. Regardless of the use or non-use
of ability grouping of classes, most elementary teachers use some
form of within-class ability grouping. The most common form of
within-class ability grouping is the use of reading groups, where
teachers assign students to one of a small number of groups (usually
three) on the basis of reading level. These groups work on
different materials at rates unique to their needs and abilities.
Similar methods are often used in mathematics, where there may be
two or more math groups operating at different levels and rates.

In another common form of within-class ability grouping in
elementary mathematics, the teacher presents a lesson to the class as
a whole, and afterwards, while the students are working problems, the
teacher provides enrichment or extension to a high-achieving group, 
remediation or re-explanation to low achievers, and something in 
between to average achievers. 

Group-paced mastery learning (Bloom, 1976) may be seen as one 
form of flexible within-class ability grouping, in that students are 
grouped after each lesson into "masters" and "non-masters" groups on 
the basis of a formative test. Non-masters receive corrective 
instruction while masters do enrichment activities. Finally,
individualized or continuous-progress instruction may be seen as extreme
forms of ability grouping, as each student may be in a unique "abil- 
ity group" of one. 

Theoretical Advantages and Disadvantages of Ability Grouping



Similar lists of advantages and disadvantages of ability grouping
have been given by theorists and reviewers for fifty years
(see, for example, Billett, 1932; Borg, 1965; Esposito, 1971;
Findley & Bryan, 1970; Good & Marshall, 1984; Heathers, 1969; NEA,
1968; Miller & Otto, 1930). Ability grouping is supposed to increase
student achievement primarily by reducing the heterogeneity of the
class or instructional group, making it more possible for the
teacher to provide instruction that is neither too easy nor too hard
for most students. Ability grouping is assumed to allow the teacher
to increase the pace and level of instruction for high achievers and
provide more individual attention, repetition, and review for low
achievers. It is supposed to provide a spur to high achievers by
making them work harder to succeed, and to place success within the
grasp of low achievers, who are protected from having to compete
with more able agemates (Atkinson & O'Connor, 1963).

The principal arguments against ability grouping have to do with
the fact that this practice must create classes or groups of low
achievers. These students are deprived of the example and
stimulation provided by high achievers, and the fact of being labeled
and assigned to a low group is held to communicate low expectations
for students which may be self-fulfilling (see, for example, Good &
Marshall, 1984; Persell, 1977).

Further, homogeneously low performing reading groups (Allington,
1980; Barr, 1975) and classes (Oakes, 1985; Evertson, 1982) have
been observed to experience a slower pace and lower quality of
instruction than do students in higher achieving groups. A lack of
appropriately behaving models may lead to "behavioral contagion"
among homogeneously grouped low achievers (Felmlee & Eder, 1983), so
that these groups may spend less time on task than other groups.

However, perhaps the most compelling argument against ability 
grouping has little to do with its effects on achievement. This is 
that ability grouping goes against our democratic ideals by creating 
academic elites (Persell, 1977; Rosenbaum, 1976; Sorensen, 1970). 
According to this line of reasoning, all students need opportunities 
to interact with a wide range of peers. Because ability groupings 
often parallel social class and ethnic groupings, disproportionately 
placing low SES, Black, and Hispanic students in low tracks (e.g.,
Rist, 1970; Haller & Davis, 1980; Heyns, 1974), the use of ability 
grouping may serve to increase divisions along class, race, and eth- 
nic group lines (see Rosenbaum, 1980). 

Comprehensive Ability Grouping in the Elementary School

This review focuses on research on ability grouping at the ele- 
mentary level. This restriction is made primarily because so many 
characteristics of elementary schools and the students they serve 
are unique to this level of schooling. Also, this review focuses on 
comprehensive ability grouping plans, which involve all students at 
particular grade levels. 

This excludes studies of special classes for the gifted (e.g.,
Atkinson & O'Connor, 1963) and for low achievers. Gifted and
special education programs may be conceived of as one form of ability
grouping, but they also involve many other changes in curriculum,
class size, resources, and goals that make them fundamentally
different from comprehensive ability grouping plans. Further,
non-randomized evaluations of gifted and special
education/mainstreaming programs suffer from serious problems of
selection bias which are less problematic in similar studies of
comprehensive ability grouping plans (Slavin, 1984a; Madden & Slavin,
1983). For reviews of research on gifted and accelerated programs see
J. Kulik & C.-L. Kulik, 1984 or Passow, 1979; for special
education/mainstreaming see Leinhardt & Pallay, 1982 or Madden &
Slavin, 1983.

One of the most important characteristics of elementary schools
for comprehensive ability grouping is that they tend to be small, 
rarely having more than three classes at each grade level. This 
means that if ability grouping is done within grade levels, the 
resulting reduction in heterogeneity may be slight. In fact, sev- 
eral studies (e.g., Clarke, 1958; Balow, 1962; Balow & Curtin, 1966; 
Goodlad & Anderson, 1963) have demonstrated that grouping students 
within grades into two or three homogeneous groups brings about a 
minimal reduction in total heterogeneity, particularly if grouping 
is done on the basis of IQ or general achievement.
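
The statistical reasoning behind this point can be illustrated with a small simulation. It is a sketch under assumptions that are not taken from the studies cited: scores are drawn from normal distributions, the correlation between the grouping measure and the specific skill is an arbitrary illustrative value, and the function name and parameters are hypothetical.

```python
import random
from statistics import mean, pstdev


def within_class_sd_ratio(r: float, n_classes: int = 3,
                          n_students: int = 12000, seed: int = 0) -> float:
    """Average within-class SD of a skill divided by its overall SD.

    Students are ranked on a general measure (e.g., IQ) that correlates
    r with the specific skill, then split into equal-sized classes.
    A ratio near 1.0 means grouping removed little heterogeneity.
    """
    rng = random.Random(seed)
    students = []
    for _ in range(n_students):
        general = rng.gauss(0.0, 1.0)
        # Skill has unit variance and correlation r with the general measure.
        skill = r * general + (1.0 - r * r) ** 0.5 * rng.gauss(0.0, 1.0)
        students.append((general, skill))
    students.sort(key=lambda s: s[0])  # rank on the grouping measure
    size = n_students // n_classes
    classes = [students[i * size:(i + 1) * size] for i in range(n_classes)]
    within = mean(pstdev([s[1] for s in cls]) for cls in classes)
    overall = pstdev([s[1] for s in students])
    return within / overall
```

Calling `within_class_sd_ratio(0.6)` (grouping on a general measure) and `within_class_sd_ratio(1.0)` (grouping on the skill itself) shows the contrast: with only three classes per grade, sorting on a loosely related measure leaves most of the spread in the specific skill intact.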

Another important feature of elementary schools is that students
are traditionally taught in self-contained classes, remaining with
the same teacher all or most of the school day, and correspondingly
teachers must attend to only one class. This situation is conducive
to the use of within-class ability grouping. In contrast, a
secondary teacher responsible for several classes is much less able
to use within-class ability grouping, principally because the number
of preparations required to teach twelve or more different subgroups
would require superhuman effort.

A third characteristic of elementary schools is that while stu- 
dents at the elementary level have widely diverse skills from the 
first days of school, they are much less heterogeneous than are stu- 
dents at the secondary level (see, for example, Coleman, Campbell, 
Hobson, McPartland, Mood, Weinfeld, and York, 1966). Perhaps for
this reason, between-class ability grouping is far less universal in
elementary than in secondary schools. 

There are differences in the curriculum and goals of elementary
and secondary schools which have an important bearing on ability
grouping. By far the most important goal of the elementary school 
is to ensure that all students are able to read and compute. 

Reading and mathematics are subjects that, at least in theory, 
lend themselves especially well to homogeneous groupings, as they 
are hierarchically organized subjects in which the learning of one 
skill depends on mastery of earlier skills. In a heterogeneous 
reading class it is unlikely that a single level of basal reader 
could be used, as it is probably unrealistic to expect low achievers
to read and understand material a grade level or more above their
reading level or to expect high achievers to profit from material a
grade level or more below their reading level. Similarly, it is
difficult to give an effective mathematics lesson to a class which
includes some students who have not mastered subtraction,
multiplication, or simple division, and some who have already
mastered division or could do so very rapidly.

Comprehensive ability grouping plans used in elementary schools
are adapted to the unique characteristics of that level of
schooling, and often have no corollaries at the secondary level.
There are three principal factors at issue in elementary ability
grouping: 1) whether ability grouping is done within or between
classes (or both); 2) whether between-class ability grouping is done
to assign students to relatively ability-homogeneous self-contained
classes, or is done only for selected subjects, with students
remaining in heterogeneous classes most of their school day; and
3) whether ability grouping is restricted to one grade level or may
combine students of similar performance level regardless of grade
level.

Five comprehensive ability grouping plans predominate both in the 
literature and in practice: ability grouped class assignment, 
regrouping for reading and/or mathematics, Joplin and nongraded 
plans, and within-class ability grouping (reading or math groups 
within the class). The relationship between the three factors
listed above and the five principal comprehensive grouping plans is 
depicted in Figure 1. 



Figure 1 Here 






It is important to note that other combinations of the same
factors can produce additional grouping plans, and such plans have
sometimes been studied. For example, ability grouped class
assignment is usually done within grades, but Rankin, Anderson, &
Bergman (1936) evaluated a "vertical grouping" plan in which upper
elementary students were assigned to classes on the basis of IQ
without regard to grade level lines. Cross-grade combination classes
(e.g., 3-4, 4-5) often resemble vertical grouping, as students
assigned to such classes are usually the higher achievers in the
lower grade and the lower achievers in the higher grade. However,
cross-grade groupings have rarely been studied, as they usually
result from administrative needs for equal-sized classrooms rather
than from a plan to improve school organization. For example, when a
principal has forty-five fifth graders and forty-five fourth graders
a combination 4-5 class is a likely result.

Also, it is important to note that many elementary classes use both
between- and within-class grouping. For example, reading groups in
the primary grades are virtually universal, regardless of whether or 
not the classes are grouped by ability. On the other hand, some 
between-class ability grouping plans (especially the Joplin Plan) 
are explicitly designed to reduce or eliminate the need for grouping 
within the classroom. 

The following sections discuss the research on comprehensive
ability grouping in elementary schools according to the four
principal categories discussed above. Each section contains a table
summarizing the principal studies on the ability grouping strategy
and a discussion of the studies and the methodological and
substantive issues they raise. The criteria for inclusion of studies
are presented below.

Criteria for Study Inclusion

The studies on which this review is based had to meet a set of
a priori criteria with respect to germaneness and methodological
adequacy. As stated earlier, all studies had to evaluate
comprehensive ability grouping in elementary schools (grades 1-6).
European studies of eleven and twelve year olds who were in
secondary schools (e.g., Douglas, 1973) were excluded, even though
studies of students of the same age in elementary schools were
included. Studies of within-class ability grouping were included, but
other programs related to within-class grouping were excluded.
Examples of such excluded programs are mastery learning,
individualized and continuous-progress instruction, cooperative
learning, multi-age grouping not done for the purpose of reducing
student heterogeneity, open classrooms, and team teaching. No
restrictions were placed on year of publication, and every effort was
made to locate dissertations and other unpublished documents
relating to ability grouping.

Methodological Requirements for Inclusion. One key element of
best-evidence synthesis is the a priori establishment of inclusion
criteria based on substantive and methodological adequacy. In the
present case, criteria were established as follows:



1. Ability grouped classes were compared to heterogeneously
grouped classes. This excluded studies which compared achievement
gains in experimental classes to "predicted" gains (e.g., Ramsey,
1962) and studies which correlated "degree of heterogeneity" with
achievement gains without identifying classes as ability-grouped or
heterogeneous (e.g., Leiter, 1983).

2. Achievement data from standardized achievement tests were pre- 
sented. This excluded scores of anecdotal accounts and several stu- 
dies of student or teacher attitudes toward ability grouping. 

3. Initial comparability of samples was established by use of
random assignment, matching of classes, or matching of students
within equivalent classes. In cases of matching of classes or
students, evidence had to be presented which established that the
classes were in fact initially equivalent in IQ or achievement level
(within 20% of a standard deviation). Studies in which experimental
and control classes were not initially equivalent but gain scores or
analyses of covariance were used to adjust scores for these
differences (e.g., Moorhouse, 1964) are listed in tables in a
separate category, and results of these studies should be
interpreted cautiously.

Several cross-sectional studies that provided little evidence of
initial equality were excluded. For example, Powell (1964) compared
achievement scores of one school using the Joplin Plan to another
using a self-contained model, with no evidence that the two schools
were in fact comparable. Some studies (e.g., Hart, 1959) compared
achievement under ability grouping to that under heterogeneous
grouping in earlier years in the same schools. Such studies were
included if there was evidence that the samples in the earlier years
were equivalent in ability or achievement. However, comparison with
previous classes was limited to two years, on the assumption that
too many unrelated changes could take place over longer periods.
This excluded one study that made a comparison over a ten-year
period (Cushenberry, 1964) and restricted attention to the first two
years of an eight-year study by Tobin (1966).

4. Ability grouping was in place for at least a semester. This 
requirement excluded only one very brief study (Piland & Lemke, 
1971). 

5. At least three experimental and three control teachers were 
involved in all included studies. The purpose of this requirement 
was to minimize the influence of teacher and class effects in small 
studies (see Slavin, 1984b) on study outcomes. This caused a few
very small studies to be excluded (e.g., Johnston, 1973; Putbrese,
1972; Williams, 1966).
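
The five requirements above amount to a screening rule that can be written down explicitly. The following is a minimal sketch, assuming a hypothetical `Study` record whose field names are invented for illustration; the thresholds (0.20 standard deviations, one semester, three teachers per condition) are the ones stated in the criteria.

```python
from dataclasses import dataclass


@dataclass
class Study:
    has_heterogeneous_control: bool       # criterion 1
    uses_standardized_test: bool          # criterion 2
    random_assignment: bool               # criterion 3
    initial_difference_sd: float          # pretest gap in SD units (criterion 3)
    statistically_adjusted: bool          # gain scores or ANCOVA (criterion 3)
    duration_semesters: float             # criterion 4
    experimental_teachers: int            # criterion 5
    control_teachers: int                 # criterion 5


def screen(study: Study) -> str:
    """Return 'include', 'include (separate category)', or 'exclude'."""
    if not study.has_heterogeneous_control:
        return "exclude"
    if not study.uses_standardized_test:
        return "exclude"
    if study.duration_semesters < 1:
        return "exclude"
    if study.experimental_teachers < 3 or study.control_teachers < 3:
        return "exclude"
    # Initial comparability: randomization, or groups within 0.20 SD.
    if study.random_assignment or abs(study.initial_difference_sd) <= 0.20:
        return "include"
    # Non-equivalent groups are kept only if scores were adjusted,
    # and are reported separately and interpreted cautiously.
    if study.statistically_adjusted:
        return "include (separate category)"
    return "exclude"
```

Writing the rule this way makes the order of the checks visible: a study failing any of criteria 1, 2, 4, or 5 is dropped outright, while criterion 3 has the intermediate outcome of a separately reported category.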

Literature Search Procedures

The studies reviewed here were located in an extensive search.
Principal sources included the Education Resources Information
Center (ERIC), Psychological Abstracts, Education Index, and
Dissertation Abstracts. In these sources, the keywords "ability
grouping," "classroom organization," "Joplin Plan," "nongraded," and
related descriptions produced hundreds of citations. In addition, all
citations made in primary sources were followed up. Every attempt was
made to obtain a complete set of published and unpublished studies
which met the substantive and methodological criteria outlined above.
Further, in a few cases where clarifications were needed about
important studies, authors were contacted directly for additional
information.

Computation of Effect Sizes

Throughout this review, effects of various ability grouping stra- 
tegies are referred to in terms of effect size. Effect sizes were 
generally computed as the difference between the experimental and 
control means divided by the control standard deviation (Glass,
McGaw, & Smith, 1981). The control group was always the heterogeneous
grouping plan unless otherwise noted, so that a positive effect
size implies greater learning in an ability grouped plan and a nega- 
tive value indicates an advantage for heterogeneous grouping. When 
means or standard deviations were omitted in studies which met 
inclusion criteria, effect sizes were estimated when possible from 
t's, F's, or exact p values (see Glass et al,, 1981). 
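
As an illustration of the basic computation described above, the following is a minimal sketch in Python. The function name and the score lists are hypothetical and invented purely for illustration; they are not data from any study in this review. The sketch assumes the sample standard deviation of the control group is used as the denominator.

```python
from statistics import mean, stdev


def glass_delta(experimental, control):
    """Effect size: (experimental mean - control mean) / control SD.

    The control group is the heterogeneous plan, so a positive value
    favors ability grouping and a negative value favors heterogeneous
    grouping.
    """
    return (mean(experimental) - mean(control)) / stdev(control)


# Hypothetical posttest scores (illustration only, not real study data).
grouped = [52.0, 55.0, 61.0, 58.0, 49.0]
heterogeneous = [50.0, 54.0, 58.0, 56.0, 47.0]

es = glass_delta(grouped, heterogeneous)
print(f"ES = {es:+.2f}")
```

Dividing by the control standard deviation alone, rather than a pooled value, keeps the metric anchored to the heterogeneous comparison condition.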

Many of the studies in this review presented data on gain scores
without presenting pre- or posttest data. Effect sizes from
achievement gain scores are typically inflated, as standard deviations
of gain scores are less than those of pre- or posttest scores
to the degree that pre-post correlations exceed 0.5. If pre-post
correlations are known, effect sizes from gain scores can be
transformed to the scale of posttest values using the following
multiplier:

$$ES = ES_{gain}\,\sqrt{2\,(1 - r_{pre\text{-}post})}$$

Because few studies presenting gain scores also provide pre-post
correlations, a pre-post correlation of +0.8 was assumed. This figure
is a characteristic correlation between fall and spring scores
on alternate forms of the California Achievement Test in the upper
elementary grades (CTB/McGraw-Hill, 1979). Substituting 0.8 in the
formula, a multiplier of 0.632 is derived, which was used to deflate
effect size estimates from gain score data. Because this value is
only a rough approximation, effect sizes from gain score data should
be interpreted with even more caution than is warranted for effect
sizes in general.
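
The gain score adjustment can be checked numerically. The sketch below assumes, as the formula implies, that pretest and posttest standard deviations are equal, so that the standard deviation of gains is the posttest standard deviation times the square root of 2(1 - r). The function names are hypothetical.

```python
import math


def gain_to_posttest_multiplier(r_pre_post):
    """Multiplier that rescales a gain-score effect size to posttest SD units.

    Assumes equal pretest and posttest standard deviations.
    """
    return math.sqrt(2.0 * (1.0 - r_pre_post))


def deflate_gain_effect_size(es_gain, r_pre_post=0.8):
    """Deflate a gain-score effect size using an assumed pre-post correlation."""
    return es_gain * gain_to_posttest_multiplier(r_pre_post)


print(round(gain_to_posttest_multiplier(0.8), 3))  # multiplier for r = 0.8
print(gain_to_posttest_multiplier(0.5))            # no adjustment at r = 0.5
```

At a correlation of exactly 0.5 the multiplier is 1, which is why gain score effect sizes are inflated only to the degree that the correlation exceeds 0.5.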

In studies in which pretest data were provided, effect sizes were
computed as gain scores divided by the control group's posttest
standard deviation. This procedure adjusts effect sizes for any
differences in pretest scores. In a few cases, pretest and posttest
scores were from different tests. In these studies (e.g., Flair,
1964), experimental-control differences divided by control standard
deviations were computed for pre- and posttests, and the difference
between these is reported as the study's effect size. Since all
studies which met inclusion criteria presented either gain scores or
pre- and posttest scores or matched on pretests, all effect sizes
were adjusted for initial starting points.

If studies did not present enough data to allow for computation
of effect size but otherwise met criteria for inclusion, they were
included in tables with an indication of the direction and consistency
of any achievement differences. In some cases only grade
equivalent differences were given, and these are presented in the
table. Because the standard deviations of grade equivalents are
around 1.0 in upper elementary school, grade equivalent differences
may be considered very rough approximations of effect size.
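
The two pretest adjustments described in this section can be sketched as follows. This is one reading of the procedure, assuming that "gain scores" means the difference between experimental and control mean gains; the function names and all numbers are hypothetical.

```python
def adjusted_effect_size(exp_pre, exp_post, ctl_pre, ctl_post, ctl_post_sd):
    """Difference in mean gains divided by the control posttest SD.

    Used when pretest and posttest come from the same test.
    """
    exp_gain = exp_post - exp_pre
    ctl_gain = ctl_post - ctl_pre
    return (exp_gain - ctl_gain) / ctl_post_sd


def effect_size_different_tests(exp_pre, ctl_pre, ctl_pre_sd,
                                exp_post, ctl_post, ctl_post_sd):
    """Posttest effect size minus pretest effect size.

    Used when pretest and posttest come from different tests, so raw
    gains are not meaningful and each test is standardized separately.
    """
    pre_es = (exp_pre - ctl_pre) / ctl_pre_sd
    post_es = (exp_post - ctl_post) / ctl_post_sd
    return post_es - pre_es


# Hypothetical group means and standard deviations (illustration only).
print(adjusted_effect_size(40, 52, 41, 50, 10))
print(effect_size_different_tests(30, 32, 8, 60, 58, 10))
```

In both cases a group that merely started ahead receives no credit for that head start.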

In general, one overall effect size is presented for each study,
unless two or more different ability grouping plans were compared to
heterogeneous control groups in the same study (e.g., Cartwright and
McIntosh, 1972) or two distinct samples were studied (e.g., Borg,
1965). Multiple effects within a study were averaged to obtain the
overall effect size estimate (see Bangert-Drowns, in press). If
studies presented adequate data, overall effect sizes were also broken
down by subject (e.g., reading, mathematics) and by achievement
or ability level. Effect sizes by ability level should be interpreted
with particular caution, as they are often inflated because
standard deviations within subgroup categories are restricted in
range.
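
A brief sketch of the averaging rule and of the restriction-of-range problem noted above. All values are hypothetical and chosen only to show the arithmetic; the function names are assumptions.

```python
from statistics import mean


def overall_effect_size(effects):
    """Average several effect sizes from one study into a single estimate."""
    return mean(effects)


def effect_size(mean_difference, sd):
    """Mean difference expressed in standard deviation units."""
    return mean_difference / sd


# Hypothetical effect sizes for three measures within one study.
print(round(overall_effect_size([0.10, -0.20, 0.40]), 2))

# The same raw difference looks larger when divided by a subgroup SD,
# because a subgroup (e.g., high achievers only) spans a narrower range.
raw_difference = 2.0
full_sample_sd = 10.0
subgroup_sd = 5.0
print(effect_size(raw_difference, full_sample_sd))  # 0.2
print(effect_size(raw_difference, subgroup_sd))     # 0.4
```

Averaging within a study keeps any one study from contributing more than one overall estimate, regardless of how many measures it reported.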

In this best-evidence synthesis, every effort was made to make
each effect size be a meaningful representation of the effect of
ability grouping on student posttest achievement, holding the posttest
standard deviation as the common metric. In all tables, randomized
studies are listed first, followed by matched studies presenting
evidence of initial equality between experimental and
control groups, and then matched studies lacking evidence of initial
equality. Within categories, studies with the largest sample sizes
are listed first. These procedures mean that effect sizes from studies
listed earlier in each table should generally be given more
weight than those listed later.
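
The table ordering rule described above amounts to a two-key sort: design category first, then sample size in descending order. A minimal sketch, with hypothetical study records and category labels:

```python
from dataclasses import dataclass

# Lower rank is listed earlier. The labels are hypothetical shorthand
# for the three design categories named in the text.
DESIGN_RANK = {
    "randomized": 0,
    "matched_equal": 1,       # matched, with evidence of initial equality
    "matched_unverified": 2,  # matched, lacking evidence of initial equality
}


@dataclass
class Study:
    name: str
    design: str
    sample_size: int


def table_order(studies):
    """Sort by design category, then by sample size (largest first)."""
    return sorted(studies, key=lambda s: (DESIGN_RANK[s.design], -s.sample_size))


# Hypothetical studies (illustration only).
studies = [
    Study("A", "matched_equal", 300),
    Study("B", "randomized", 120),
    Study("C", "matched_unverified", 900),
    Study("D", "matched_equal", 750),
]
print([s.name for s in table_order(studies)])
```

Note that a small randomized study still precedes a large matched study, since design category takes priority over sample size.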

However, it is important to remember that any effect size is only
a rough indicator of the effect of a treatment. Many factors may
influence effect size, such as differences in subjects, measures,
experimental procedures, and study durations. Often, there are
substantial nonsystematic differences in effect sizes for subgroups or
for similar measures within the same study. For example, one study
by Breidenstine (1936) had a mean effect size of -.08, but effects
at particular grade levels ranged from -.89 to +.54. In another
study by Slavin and Karweit (1985), effects of within-class ability
grouping were +.64 for mathematics computations but .00 for
mathematics concepts and applications. Had Breidenstine (1936) studied
only one grade level or had Slavin and Karweit (1985) used only one
mathematics achievement measure, their results would have appeared
quite different. For these reasons, effect size data should always
be interpreted cautiously, in light of the quality and consistency
of the studies from which they were derived.

Research on Comprehensive Ability Grouping

Ability Grouped Class Assignment

A total of fourteen studies of comprehensive ability grouped
class assignment plans were located.

The major characteristics and findings of the fourteen studies 
are summarized in Table 1. The randomized study is listed first, 
followed by matched studies in descending order of sample size. 



Table 1 Here 



Inspection of Table 1 clearly indicates that the effects of
comprehensive ability grouped class assignment on student achievement
are zero. The median total effect size across the seventeen
comparisons in fourteen studies is exactly zero, and the effect sizes
cluster closely around this value; of thirteen comparisons which
yielded effect size data, eight fell in the range -.10 to +.10, and
eleven in the range -.15 to +.15. Effect sizes for reading and
mathematics did not exhibit any pattern different from that for
overall effects. Further, little support appears in Table 1 for the
assertion that high achievers benefit from ability grouping while
low achievers suffer. Three studies (Borg, 1965; Flair, 1964;
Tobin, 1966) found such a pattern, but three others (Bremer, 1958;
Hartill, 1936; Morganstern, 1963) found just the opposite, and
Barker-Lunn (1970), Loomer (1962), and Rankin et al. (1936) found no
differences according to achievement level.

The only randomized study of ability grouped class assignment is
one by Cartwright and McIntosh (1972), who compared three grouping
methods in a school in Honolulu attended by disadvantaged students
from a housing project. The students were ethnically diverse, and 
most came to school speaking Pidgin English and had to learn stan- 
dard English as a second language. Students in grades 1-2 were ran- 
domly assigned to one of three treatments: Self-contained heteroge- 
neous grouping, self-contained ability grouping, and flexible. The 
ability grouped students were assigned to relatively homogeneous 
classes according to intellectual ability and reading achievement 
without regard for grade level, so that the individual classes were 
somewhat heterogeneous in chronological age. The flexible classes 
were grouped for various subjects according to their performance 
level in those subjects, again without regard for grade level, and 
were frequently regrouped as their progress during the year war- 
ranted. All three treatments were begun when students entered the 
first and second grades and were continued for two years. 

The dependent measures were scores on the Metropolitan Achievement
Test. As shown in Table 1, the heterogeneous classes had
higher scores in reading (ES = -.17) and in mathematics (ES = -.52)
than the ability grouped classes. The heterogeneous classes also
achieved more in reading than the flexible classes (ES = -.28), but
there were no differences in mathematics.



Even though the Cartwright and McIntosh (1972) study used random
assignment, it cannot be considered conclusive evidence against the
use of ability grouped class assignment.
First, there was only one class at each grade level in each
treatment, restricting the possibilities for reducing heterogeneity by
ability grouping even more than is usually the case in elementary
ability grouping studies. Second, the population involved is quite 
atypical, and generalization to other settings, even other disadvan- 
taged schools, is difficult. 

Since there is only one randomized study of ability grouped class 
assignment and it has some important limitations, we must look at 
the best of the nonrandomized studies, those which used matching 
procedures to equate nonrandomly assigned groups and presented data
to indicate that the groups were, in fact, initially equivalent in 
achievement or ability. 

Three large, longitudinal studies done in the 1960's stand out in
the study of ability grouped class assignment: Barker-Lunn's (1970)
study of streaming in English and Welsh junior schools, the Goldberg
et al. (1966) study of different grouping patterns in New York City
schools, and Borg's (1965) study in two Utah school districts.

Of these three studies, the Goldberg et al. (1966) study is perhaps
the most remarkable. This study involved eighty-six grade five
classes in forty-five New York City elementary schools. Principals
of all New York schools submitted Otis IQ distributions of their
fourth grades. Only those schools with at least four students with
IQ's of at least 130 were included in the study sample, which had
the effect (according to the authors) of restricting the sample to
[...]. Schools were asked to assign students to classes for the fifth
grade to conform to any of fifteen grouping patterns, ranging from
extremely narrow to extremely broad. "Extremely narrow" classes
included students falling within one IQ decile (e.g., 120-130), or
those restricted to IQ 130 and up or 99 and below. "Extremely
broad" classes included a full range of students from below 99 to
above 130. Between these extremes were various moderately narrow
and moderately broad patterns, classes containing students in two to
four contiguous IQ deciles.

The principals were asked to keep students in the designated
grouping patterns for two years, throughout grades 5 and 6, and only
those students who were in the same schools for the entire two-year
period (79% of the original sample) were included in the data
analyses.



With classes in fifteen grouping patterns, Goldberg et al. were 
able to simulate many alternative grouping arrangements. For exam- 
ple, they could compare very homogeneous to very heterogeneous 
grouping plans by comparing the achievement gains of all students in 
one-decile classes to those in five-decile classes. They could 
simulate provision of special classes for the gifted by comparing 
five-decile (heterogeneous) classes to combinations of one-decile 
(130+) and four-decile (less than 99 to 130) classes, and so on.

Unfortunately, Goldberg et al. do not present their actual
achievement data, but only describe significant differences and
patterns of findings. However, the patterns they present consistently
favor broad, heterogeneous grouping plans for all students except
for the most gifted (130+), who did equally well in broad- or
narrow-range classes. Presence of gifted students was beneficial for the
achievement of most students in most subjects, while the presence of
low achievers was neither beneficial nor detrimental overall.

The Goldberg et al. (1966) study is arguably the best evidence in
existence against the possibility that reductions in IQ heterogeneity
can enhance student achievement in the upper elementary grades.
The size and rigor of the experiment make it highly unlikely that
any non-trivial positive effect of ability grouping could have been
missed. While most achievement comparisons in the Goldberg et al.
study were non-significant, the patterns of mean differences and of
those differences which were statistically significant support
heterogeneous rather than ability grouped class assignments.

The Barker-Lunn (1970) study in England and Wales similarly
provides little support for ability grouped class assignment. This
study compared the achievement gains of students in 36 streamed 
junior schools (serving students aged 7 to 11) and 36 unstreamed 
schools, matched on social class. The streamed schools had a slight 
advantage in achievement after one year, so the initial comparabil- 
ity of the samples was questioned by the author; the year-to-year 
scores and four-year longitudinal comparisons used first-year scores 
as covariates to control for this initial difference.

As in the case of the Goldberg et al. (1966) study, Barker-Lunn
presents only patterns of findings and significance tests rather
than actual achievement data.
Overall, there were no meaningful trends favoring streamed or
unstreamed schools. All but a few comparisons were non-significant,
and those which were significant did not consistently favor
streamed or unstreamed schools. Again, if there were any consistent
effect of ability grouped class assignment on student achievement, a
study the size and quality of Barker-Lunn's would be very likely to
find it.

The third large, longitudinal study is one by Borg (1965), who
compared achievement gains in two adjacent districts in Utah, one of
which used ability grouping and the other heterogeneous grouping.
Two elementary cohorts were
studied. One began in the fourth grade and was followed through 
grade seven. Another began in the sixth grade and was followed 
through grade nine. The results for the sixth grade sample pre- 
sented in Table 1 are only for the first year, as these students 
went on to junior high school beginning in grade seven. 

Even though this study used as large a sample as Goldberg et al.
(1966) and Barker-Lunn (1970) and was also carefully controlled, the
nature of the samples involved makes the results of this study less
conclusive. First, only two districts were involved, and any
differences between the districts other than the use of ability
grouping are completely confounded with grouping practice. One district
served a small city, while the other served its outlying area, so
unmeasured population differences may have been operating. Second,
while the two districts' mean pretest scores were equal within each
[...] district using heterogeneous grouping, particularly in the sixth
grade sample. Third, the districts involved had been using their
respective grouping methods for many years, so that the students
being studied had been in ability grouped or heterogeneous classes
for three years (the grade four sample) or five years (the grade six
sample). Any effects of grouping may have already been registered
before the study began.

The results of the Borg (1965) study were inconsistent, but in
general the longitudinal data following fourth graders indicated
that ability grouping was beneficial for the achievement of high-IQ
students, detrimental for that of low-IQ students, and neutral for
average-IQ students. After one year, high- and average-IQ students
scored higher in ability grouped than in heterogeneous classes, but
by the seventh grade this difference had disappeared. High- and
average-achieving sixth graders gained more in ability grouped than
in heterogeneous classes, but these differences also dissipated in
junior high school.

One well designed (but fifty year old) study by Hartill (1936)
compared ability grouped to heterogeneous class assignment in fifteen
New York City schools. Students in grades five and six were
assigned to ability grouped or heterogeneous classes for one semester,
and were then reassigned to classes according to the opposite
grouping pattern for a semester. Not only were the students their
own controls (since they experienced both grouping plans), but they
were individually matched with one another, so that the group which
experienced ability grouping first was identical in IQ to that which
experienced heterogeneous grouping first. The results indicated
that low-IQ students achieved slightly better in ability grouped
classes (ES = +.18), high-IQ students achieved slightly better in
heterogeneous classes (ES = -.12), and average-IQ students achieved
equally well in the two grouping plans. Overall, achievement gains
were identical in the ability grouped and heterogeneous classes.

Another important early study was one by Rankin et al. (1936), 
who compared students matched on achievement level in three pro- 
grams. One was traditional ability grouped class assignment done 
within grade levels, except that in mathematics these classes used a 
program essentially identical to modern group-paced mastery learning 
(see Block & Anderson, 1976). Another, called "vertical grouping,"
assigned students to classes according to their level of achievement 
without regard for their grade level. This procedure produced such 
homogeneous classes that reading groups within the classes were con- 
sidered unnecessary. The third plan involved heterogeneous grouping 
of classes, with the additional requirement that within-class abil- 
ity grouping (including use of reading groups) was not allowed. 
Teacher and administration attitudes toward this heterogeneous plan 
were quite negative, as the degree of heterogeneity in these classes 
was great and teachers were unable to use any form of grouping to 
accommodate student differences. However, achievement differences
between the two ability grouped plans and the heterogeneous classes
were small for the ability grouped plan (ES = +.05) and for the
"vertical" plan (ES = +.07).

A four-year study of streamed and unstreamed junior schools in
[...] schools. The
schools themselves were selected from the same districts and were
found to have nearly identical mean IQ's. Students were individually
matched on IQ within the schools. After three and one-half
years, the students in the unstreamed schools were achieving at a
significantly higher level than those in the streamed schools (ES =
-.26). However, other multi-year studies (Breidenstine, 1936;
Tobin, 1966) found effects near zero, and Morganstern, who followed
students from the fourth to the sixth grade, found a small benefit
of ability grouping (ES = +.15).

Some authors (e.g., Good and Brophy, 1984) have suggested the use
of a modified ability grouping plan in which high and average
achievers are mixed and average and low achievers are mixed.
However, a study by Loomer (1962) found no achievement benefits of such
a plan (ES = -.04).

In addition to the studies listed in Table 1, a few studies have 
correlated the degree of heterogeneity in classes with student 
achievement. Justman (1968) found that the reading achievement of 
third graders increased slightly more in heterogeneous than homoge- 
neous classes, with average and low achieving classes gaining the 
most from heterogeneity. Leiter (1983) found no correlation between 
class homogeneity and third grade reading and mathematics
achievement, controlling for the previous year's scores, although there was
a nonsignificant trend toward higher reading achievement and lower
mathematics achievement in more homogeneous classes. Edminston and
Benfer (1949) divided sixteen classes of fifth and sixth graders
into classes with wide and narrow IQ ranges. Over six months,
students in the wide range classes gained significantly more in
composite achievement than did students in narrow range classes.

Summary and Discussion: Ability Grouped Class Assignment. Given
the persistence of the practice over time and the belief teachers
typically place in its effectiveness, it is surprising to see how
unequivocally the research evidence refutes the assertion that
ability grouped class assignment can increase student achievement in
elementary schools. There is a considerable quantity of good
quality research on this topic, such that any impact of grouping on
achievement would surely have been detected.

Several earlier reviews have made the claim that ability grouping
is beneficial for high-ability students and detrimental for
low-ability students (e.g., Bash, 1961; Esposito, 1973; Begle, 1975).
This claim is not clearly supported by the present review. It is
possible that a clearer pattern emerges in secondary studies, but it
is more likely that confusion arises when studies of special programs
for the gifted and for low achievers are included in ability
grouping reviews. Studies of special programs for the gifted tend
to find achievement benefits for the gifted students (J. Kulik and
C. L. Kulik, 1984; Passow, 1979), while studies of mainstreaming vs.
special education for students with learning problems tend to favor
mainstreaming. Including studies of special programs for the gifted
and learning disabled in reviews of ability grouping, as was done by
Begle (1975), Borg (1965), Findley & Bryan (1970), and others, would
give the impression that ability grouping is beneficial for high
achievers and detrimental for low achievers. However, it is likely that
characteristics of special accelerated programs for the gifted
account for the effects of gifted programs, not the fact of separate
grouping per se (see Fox, 1979). Also, problems of selection bias
in nonrandomized studies of programs for the gifted and for students
with learning problems bias the results of these studies toward the
higher placement, spuriously favoring separate programs for the
gifted and mainstream placement for low achievers (see Borg, 1965;
Slavin, 1984a).

Regrouping for Reading and Mathematics

In many elementary schools, reading and/or mathematics is sched- 
uled at the same time for all students in a particular grade. At 
that time, students leave their heterogeneous homeroom classes to 
receive reading or mathematics instruction in a class that is more 
homogeneous in the skills in question. 

Previous reviews and meta-analyses (e.g., Borg, 1965; Findley &
Bryan, 1970; C. L. Kulik and J. Kulik, 1984) have not made a clear
distinction between regrouping and ability grouped class assignment.
Yet there are several important theoretical reasons to do so.
First, regrouping minimizes the presumed negative psychological
effects of ability grouping. Students spend most of the day in
heterogeneous classes, with which they almost certainly identify.
Second, regrouping is always done on the basis of actual performance in
reading or mathematics, not on IQ, and is usually flexible, so that
any errors in assignment or changes in student level of achievement
can be easily accommodated by moving students to different sections.
For these reasons, it is likely that regrouping can produce much
more homogeneity in the skills being taught than can ability grouped
class assignment, which is usually based on IQ or general
achievement and is relatively inflexible.

Unfortunately, there is neither the number nor the quality of 
studies of regrouping to enable definitive conclusions concerning 
the effectiveness of such plans. Only three studies used matching 
and presented evidence of initial equality. Four additional studies 
lacked evidence of initial equality but did adjust posttests for 
pretests and other variables. Overall, five of the seven studies 
found that students learned more in regrouped than in heterogeneous 
classes, while two found the opposite trend. (See Table 2.) 



Table 2 Here 



Two of the studies investigated the practice of regrouping for
reading only. One of these was a large study by Moses (1966)
involving 54 classes in rural Louisiana. This carefully controlled
study held constant time and instructional materials in matched
experimental and control classes. No consistent differences were
found in reading achievement. However, a study by Berkun, Swanson,
& Sawyer (1966) did find significantly greater gains for regrouped
than for self-contained reading classes (ES = +.32). This article
provides few details of the treatment procedures, however, and may
suffer from pretest differences between the experimental and control
groups (all data presented are posttests adjusted for pretests).

A study by Provus (1960) of regrouping for mathematics provides
the best evidence in favor of this practice. Experimental students
in eleven classes in a suburb of Chicago were regrouped from their
heterogeneous homerooms into relatively homogeneous mathematics
classes at the same grade level. Achievement gains for students in
these classes were compared to those of students matched on IQ who
remained in heterogeneous classes all day. One effect of the
regrouping was to allow high achievers to be exposed to material far
above their grade level; there were cases of fourth graders finishing
the year working on eighth grade material. Perhaps for this
reason, achievement gains for high ability students in the regrouping
program were much greater than those of comparable control students
(ES = +.79), but the program was less spectacularly beneficial
for average ability (ES = +.22) and low ability students (ES =
+.15).

In contrast, Davis & Tracy (1963) found that regrouping for
mathematics was detrimental to the achievement of students in a
rural North Carolina town. However, this study compared only two
schools and there were substantial achievement differences at
pretest. Also, it is important to note that no attempt was made to
provide differentiated materials to the regrouped classes; all
classes used grade level-appropriate texts.

Finally, three studies investigated the effects of regrouping in 
multiple subjects. In a study by Koontz (1961) in Norfolk, Virgi- 
nia, experimental students were separately grouped according to 
their achievement in reading, mathematics, and language. At other 
times students remained in "intact classes," but it is unclear 
whether these were ability grouped or heterogeneous. This method 
approaches a departmentalized arrangement, as students changed 
classes three or four times each day. Its effects on all three
subjects involved turned out to be negative, particularly for reading,
where the heterogeneous, self-contained classes gained .42 grade 
equivalents more than the regrouped students. A study by Balow and 
Rudell (1963) evaluated regrouping for reading and math, and found 
positive effects in both subjects for average and low achievers. 
However, pretest differences favoring the experimental (regrouped) 
classes throw some doubt on these findings. 

Finally, Morris (1969) studied a program in which regrouping was
done for reading and math. The program was called a "nongraded
primary plan" by the author, but since regrouping was done within grade
levels, it was categorized as a regrouping program. Overall student
achievement at the end of three years was higher in the regrouped
classes than in heterogeneous control groups, controlling for IQ (ES
= +.43). After two more years during which all students, experimental
as well as control, were in a regrouping plan, the former
experimental students had greatly increased their advantage over the
control group (ES = +1.20).

Summary and Discussion: Regrouping for Reading and Mathematics.
Overall, the results of studies of regrouping for reading and
mathematics are inconclusive. None of the grouping patterns evaluated
were consistently successful, although one study (Provus, 1960) gave
strong evidence favoring the use of regrouping in mathematics if
students are given materials appropriate to their levels of
performance. Another study (Morris, 1969) found strong positive effects
of regrouping for reading and mathematics. This study also
emphasized adaptation of the level of instruction to accommodate student
differences. In contrast to the situation with ability grouped
class assignment, where there is adequate high quality evidence to
conclude that no important effects of ability grouping exist, it is
still quite possible that regrouping for one or two subjects is
instructionally effective, and evidence from studies of Joplin and
nongraded plans, summarized in the following section, provides some
support for this possibility. However, more research is needed to
establish the achievement effects of regrouping within grade levels.

Joplin and Nongraded Plans

The Joplin Plan is an extension of regrouping for reading to allow
for grouping by reading
level across grade level lines. This practice typically creates
reading classes in which all students are working at the same or at
most two reading levels, so that within-class ability grouping may
be reduced or eliminated. The tradeoff between within-class (reading
groups) and between-class (Joplin) ability grouping is a pivotal
issue in studies of the Joplin Plan, which may be conceptualized not
as ability grouping versus heterogeneous grouping but as between-
versus within-class grouping (see, for example, Newport, 1967). In
contrast, studies of regrouping for reading within grades maintain
reading groups within the class, although there may be some
reduction in the number of reading groups used.

Nongraded plans share with the Joplin Plan the idea of grouping
students according to performance level in a specific skill, ignoring
grade level or age. Some forms of nongraded grouping are very
similar to the Joplin Plan, except that they are applied in the
primary rather than intermediate grades and have been utilized in
subjects other than reading. Some nongraded plans incorporated the
practice of allowing students to spend two or four years in the
primary grades if their progress warranted acceleration or additional
time, respectively, but it is unclear how often students actually
deviated from the three-year norm. One study (McLoughlin, 1970)
found that students in nongraded plans hardly ever completed the
primary grades in less than three years and took four years to do so
no more often than did students in graded programs.



Table 3 Here 



Table 3 summarizes the research on the Joplin Plan and Joplin- 
like nongraded plans. The Table includes several studies (e.g., 
Hillson et al., 1964) whose experimental groups were described as 
nongraded or ungraded but which more closely resembled the Joplin 
Plan, in that only one subject (usually reading) was involved. The 
studies listed in Table 4 evaluated more comprehensive nongraded
plans involving several subjects and such additional features as
team teaching, individualized instruction, and learning centers.
However, it should be noted that the division of studies of non- 
graded plans into Tables 3 and 4 is not exact. Several studies do 
not adequately describe their nongraded plans, and others vary on a 
continuum from completely Joplin-like (e.g., Skapski, 1960) to 
highly complex flexible grouping plans (e.g., Bowman, 1971). As
nongraded plans incorporate more of the features proposed by Goodlad 
and Anderson (1963), they cease to be just ability grouping plans, 
but come to resemble forms of the open classroom (Giaconia and 
Hedges, 1982) or of Individually Guided Education (Klausmeier, Ross-
miller, & Saily, 1977).



Overall, the evidence in Table 3 strongly supports the use of the
Joplin Plan. Joplin classes achieved more [...] finding no
differences. The studies include two using random assignment and
ten using matching with evidence of initial equality.

Morgan and Stucker (1960) conducted a randomized study of the
Joplin Plan in rural Michigan. Fifth and sixth graders were
matched on reading achievement and then randomly assigned to four
Joplin and four control classes. Teachers were also randomly
assigned to treatments. Because there were only two Joplin classes
at each grade level, the amount of cross-grade grouping that could
be done was limited, and control groups were ability grouped (within
grade), yet the authors still document a considerable reduction in
class heterogeneity as a result of cross-grade assignment.

Results indicated significantly higher achievement in the Joplin
Plan for high and low achievers in the fifth grade and low achievers
in the sixth grade. The authors explain the failure to find experi-
mental-control differences for high achieving sixth graders by noting
that because of the small number of classes involved in the study,
high achieving sixth graders could not be accelerated as much as
would have been possible with larger numbers of classes. Whatever
the explanation, larger experimental-control differences for low
achievers (ES = +.94) than for high achievers (ES = +.32) are
entirely due to the lack of differential gains for high achieving
sixth graders.

Hillson, Jones, Moore, and Van Devender (1964) studied a non-
graded program resembling the Joplin Plan, and randomly assigned
students and teachers to nongraded or control classes. Students
in the nongraded classes were assigned to heterogeneous classes but
regrouped across grade levels for reading. They proceeded through
nine reading levels, and were continually regrouped on the basis of
their reading performance. Within each reading class teachers had
multiple reading groups and used traditional basal readers and
instructional methods (J. W. Moore, personal communication, January
23, 1986).

The results of this study supported the efficacy of the nongraded 
program. After three semesters, reading scores for experimental 
students on three standardized scales were considerably higher than 
for control students (ES = +.72, or about .41 grade equivalents).
After three years in the program, experimental-control differences
had diminished, but were still moderately positive (ES = +.33)
(Jones, Moore, and Van Devender, 1967).
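
The effect sizes (ES) quoted in this section can be read as standardized
mean differences. The following is a minimal sketch, assuming that ES is
the experimental-control difference divided by the control group's
standard deviation (an assumption here, since this passage does not
restate the formula). Under that assumption, the Hillson et al. figures
imply a standard deviation of roughly .57 grade equivalents. The function
names are hypothetical and chosen only for illustration.

```python
def effect_size(exp_mean, ctrl_mean, ctrl_sd):
    """Standardized mean difference: (experimental - control) / control SD.

    Assumed formula; the surrounding text does not define ES explicitly.
    """
    if ctrl_sd <= 0:
        raise ValueError("control SD must be positive")
    return (exp_mean - ctrl_mean) / ctrl_sd


def implied_sd(difference, es):
    """Back out the SD implied by a raw difference and its effect size."""
    if es == 0:
        raise ValueError("effect size must be nonzero")
    return difference / es


# Figures quoted in the text for Hillson et al. (1964):
# ES = +.72, or about .41 grade equivalents (GE).
sd = implied_sd(0.41, 0.72)          # roughly 0.57 GE
check = effect_size(0.41, 0.0, sd)   # recovers the quoted ES of about 0.72
```

Because the quoted figures are rounded and pooled over three scales, the
implied standard deviation should be treated as approximate.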

Ten studies compared Joplin or Joplin-like nongraded classes to
matched control classes and presented evidence of initial compar- 
ability. The largest of these (Russell, 1946) was done before Floyd 
(1954) first described the Joplin plan, but evaluated a very similar 
intervention. Students in grades 4-6 were regrouped for reading 
without regard to grade level. This created relatively homogeneous 
groups, but homogeneity was increased still further by the use of
reading groups within the classes (usually two) and by reviewing and
modifying group assignments four times per year. Students in this
plan, called "circling," were matched with students in other schools
which did not regroup for reading and followed for two years, from
the beginning of grade 4 to the beginning of grade 6. Results indi-
cated no differences between the two types of grouping plans (ES =
.00).

It is interesting to note that the only other matched equivalent 
study to find no advantage for the Joplin plan also used reading
groups within the regrouped reading classes. This was a study by 
Carson and Thompson (1964), in which students in grades 4-6 were 
regrouped across grade lines for reading but were still assigned to 
reading groups within their reading classes. These students' gains
in reading achievement were compared to those of students assigned 
by ability (within grade) to self-contained classes. 

The eight remaining matched equivalent studies all found positive 
effects of Joplin or Joplin-like nongraded plans on student achieve-
ment. For example, Green and Riley (1963) compared the Joplin Plan
to the traditional methods in use in the same schools during the
previous year. Students in the Joplin classes gained significantly
more in reading achievement than did students in the earlier years
(ES = +.36).

In a study by Hart (1959), grade 4-5 students were regrouped into
nine reading classes. Seven of these had only one reading level,
one had two, and one had four very low achieving groups (but may
have used fewer than four reading groups in the classroom). The top
class was reading at the seventh grade level, and the bottom class
contained students ranging from primer to second grade, second
semester. Students' scores were compared to those of students
taught by the same teachers the previous year. Gains on the Cali-
fornia Achievement Test strongly favored the Joplin approach (ES =
+.89). In both the fourth and the fifth grades, Joplin groups
gained about a full grade equivalent more than did heterogeneously 
grouped classes in earlier years. 

Rothrock (1961) also compared Joplin Plan classes to heterogene- 
ous classes which used within-class ability grouping, and found sig- 
nificantly positive effects on student reading achievement and 
work-study skills, averaging .44 grade equivalents more than in het- 
erogeneous classes. An individualized reading program fell between 
the Joplin and heterogeneous programs in achievement effects. Green
and Riley (1963) found consistently greater reading achievement
gains in Joplin Plan classes than in matched heterogeneously grouped
classes in different schools (ES = +.36). Every experimental class
gained significantly more than its corresponding control class, and
the average experimental-control difference in grade equivalents was
.54. Anastasiow (1968) found no significant differences between a
Joplin-type regrouping plan and heterogeneous grouping, but the
trends favored the Joplin Plan groups (ES = +.15).

Moorhouse (1964) compared a school using the Joplin Plan to a
heterogeneously grouped control school. The students in grades 4-6
were grouped in seven reading classes. Three classes contained one
reading level, three contained two, and one (the lowest) contained
students from five levels, with a range from first to third grade
levels. The top class had students working at the seventh and
eighth grade level. The authors note that three quarters of all
students in the Joplin classes were working at a level different
from the ones usually used at their grade level. Unfortunately, the
results of the Moorhouse study are marred by pretest differences
favoring the control groups in grades 4 and 6. However, at all
three grade levels (including grade five, where there were no pre-
test differences), Joplin classes gained considerably more in read-
ing achievement than heterogeneous control classes, averaging gains
of 1.24 grade equivalents in one semester, more than twice the gains
seen in control classes (.61 GE).

Experimental and control classes were followed for a total of 
five semesters. By the end of sixth grade, fourth graders had 
gained a total of .50 grade equivalents more than control. Fifth 
graders had gained about .40 grade equivalents by the end of sixth 
grade, but lost this advantage by the eighth grade. Sixth graders, 
who made the greatest gains initially, maintained most of that gain 
through the eighth grade. Overall, the patterns of results indicate 
that achievement gains due to the Joplin Plan were primarily seen 
early in the program implementation and then diminished as students 
entered the junior high school.

Ingram (1960) evaluated a nongraded program very similar to that
studied by Hillson et al. (1964). Students were
assigned to one of nine reading levels without regard to age. As
students moved from grade to grade, they picked up where they had
left off in the previous year. Teachers generally had more than one
reading group within their reading classes. As in the Hillson et
al. (1964) study, the results strongly supported the nongraded
approach. By the end of three years, students in the nongraded pro-
gram were achieving approximately .7 grade equivalents ahead of
similar students in earlier years before nongrading (ES = +.55) on
standardized reading, spelling, and language tests.


Halliwell (1963) evaluated a nongraded primary program that was 
virtually identical to the Joplin Plan. Students in grades 1-3 were 
regrouped for reading only, and remained in heterogeneous classes 
the rest of the day. Spelling was also included in the regrouped 
classes for second and third graders. The article is unclear as to 
whether within-class grouping was used in regrouped reading classes, 
but there is some indication that reading groups were not used. 
Results indicated considerably higher reading achievement in non- 
graded classes than in the same school the year before nongrading 
was introduced (ES = +.59). Scores were higher for nongraded stu-
dents at every grade level, but by far the largest differences were
for first graders, who exceeded earlier first grade classes by .94
grade equivalents (ES = +1.22).



It is important to note that mathematics achievement, measured
[...], was also higher
in the nongraded classes than in previous years. Since
mathematics was not part of the nongraded program, this finding sug-
gests the possibility that factors other than the nongraded program
might account for the increases in student achievement. However, 
the author notes that teachers claimed to have been able to devote 
more time to mathematics because the nongraded program required less 
time for reading, spelling, and language instruction than they had 
spent on these subjects in previous years. 

A study by Skapski (1960) also evaluated the use of nongraded 
organization for reading only. The details of the nongraded program
were not clearly described, but it appears that reading groups were
not used within regrouped classes and that curricula and teaching
methods were traditional. Two comparisons were made. First, the
reading scores of students in the nongraded program were compared to
the same students' arithmetic scores, on the assumption that since
arithmetic was not involved in the nongraded plan any differences
would reflect an effect of nongrading. Results of this comparison
indicated that second and third grade-aged students achieved an
average of 1.1 grade equivalents higher in reading than in arith- 
metic. 

Further, scores of third graders who had spent three years in the 
nongraded program were compared to those of students in two control 
schools matched on IQ. Results indicated that the nongraded stu-
dents achieved a much higher level in reading than did control
students (ES = +.57), but there were no differences in arithmetic.
Differences were particularly large for students with IQ's of 125 or
higher (ES = +.97), but were still quite substantial for students
with IQ's in the range 88-112 (ES = +.52).

Only one study evaluated the use of a nongraded program in mathe-
matics. This study (Hart, 1962) took place in the same school which
evaluated the Joplin Plan in reading in its intermediate grades
(Hart, 1959). Experimental students were regrouped for arithmetic
instruction across grade lines, and were taught as a whole class.
Students were frequently assessed on arithmetic skills and reas-
signed to different classes if their performance indicated that a
different level of instruction was needed. Experimental students
who had spent three years in the nongraded arithmetic program were
matched on IQ, age, and socioeconomic status with students in simi-
lar schools using traditional methods. It is not stated whether
control classes used within-class ability grouping for arithmetic
instruction. Results indicated an advantage of about one-half grade
equivalent for the experimental group (ES = +.46).

Summary and Discussion: Joplin Plan. Considered together, the
results of the Joplin and Joplin-like nongraded plans are remarkably
strong. Both randomized studies found positive effects on student
achievement, as did all but two of the ten matched equivalent stu-
dies.
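
The pattern summarized above can be illustrated with the overall effect
sizes that are quoted in the study descriptions in this section. The
sketch below is only an illustration of the arithmetic, not a substitute
for Table 3: it uses just the values legible in the text (taking the
three-semester figure for Hillson et al.), and studies described without
an overall ES are left out. The dictionary name and labels are
hypothetical.

```python
from statistics import median

# Overall effect sizes as quoted in the study descriptions of this section.
# Studies for which no overall ES is given in the text are omitted.
quoted_es = {
    "Hillson et al. (1964), three semesters": 0.72,
    "Russell (1946)": 0.00,
    "Green and Riley (1963)": 0.36,
    "Hart (1959)": 0.89,
    "Anastasiow (1968)": 0.15,
    "Ingram (1960)": 0.55,
    "Halliwell (1963)": 0.59,
    "Skapski (1960)": 0.57,
    "Hart (1962), arithmetic": 0.46,
}

# Studies whose quoted ES favors the Joplin or Joplin-like plan.
positive = [name for name, es in quoted_es.items() if es > 0]

# Median of the nine quoted values.
med = median(quoted_es.values())  # 0.55 for the values listed above
```

A median is used rather than a mean so that a single very large or very
small value does not dominate the summary.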

Only four studies presented results according to student ability
levels. Morgan and Stucker (1960) found stronger positive effects
of the Joplin plan for low than for high achievers, while Moorhouse
(1964) found the largest gains for high and average achievers, and
Kierstead (1963) found no experimental-control differences at any 
level of ability. Skapski's (1960) results indicated that very able 
students benefited the most from a Joplin-like nongraded reading 
program. In no case did one subgroup gain at the expense of 
another; either all ability levels gained more than their control 
counterparts or (in the case of the Kierstead study) none did. 

Throughout its history, the concept of "nongradedness" has been 
presented as an ideal to which schools may aspire rather than as a
specific program which they may implement. Many of the studies
of nongraded plans, especially of the Joplin-like variations, apolo- 
gize for their failure to fully live up to the "nongraded" ideal. 
As noted earlier, implementations of programs described as nongraded 
have ranged from simple regrouping plans for reading to very complex 
interventions. For example, Carbone (1961, p. 88) poses the follow- 
ing "six questions to be considered in discussing the concept of 
nongrading:

1. Do we have clear statements of our instructional objectives 
organized in a realistic sequence and covering the entire 
span of our program? (Objectives) 



2. Do we have a sufficient variety of instructional materials
on different levels of sophistication so that each teacher
can adjust instruction to the range of abilities found in 
each classroom? (Instructional materials) 

3. Are we able to move toward greater individualization of 
instruction so that pupils can actually progress at indi-
vidual rates? (Individualized instruction)

4. Are we willing to use grouping practices that are flexible 
enough to allow easy movement from group to group within a 
class and from class to class within a school? (Grouping 
practices) 

5. Do we have evaluation devices, based on our instructional 
objectives, that will provide clear evidence of pupil 
attainments and thus facilitate our decisions on grouping 
and progress? (Evaluation devices) 

6. Are we sufficiently committed to that educational shibbo-
leth -- recognizing individual differences -- to do some-
thing about the differences that we have so long only
"recognized"? (Human factors)" 

Examples of recommended practices grouped under these six ques- 
tions include use of self-teaching and self-testing materials, inde- 
pendent study on projects appropriate to students' interest, abili- 
ties, and needs, use of independent study or instruction to very 
small groups (2-6 students) at least two-thirds of the day, and
variable amounts of time in which students graduate. In this ideal 
form, the nongraded elementary school is closer in conception to 
individualized instruction or to the open school than it is to the
Joplin Plan, which does not use individualized instruction and
reduces or eliminates within-class grouping.

Unfortunately, studies of nongraded programs often do not specify 
exactly what was implemented. The studies reviewed in this section
evaluated nongraded plans which clearly involved several subjects 
and a comprehensive approach to nongrading as well as a few in which 
the nongraded plan was only briefly described. 

Overall, the studies of comprehensive nongraded plans are less 
consistent in finding benefits of these programs than are studies of 
Joplin-like nongraded plans, but the median effect size is still 
moderately positive (ES = +.29). However, there is a tendency for
the higher-quality studies to produce larger effect sizes than the
lower-quality ones. For example, a large matched-equivalent study
by Hickey (1963) found that students in nongraded primaries in seven
Catholic schools learned significantly more after three years than
did students in similar graded schools (ES = +.46). Similar results
were obtained in matched equivalent studies by Buffie (1962; ES =
+.35), Remacle (1971; ES = +.31 grade equivalents), and Machiele
(1965; ES = +.50). However, in none of these studies were the non-
graded programs clearly described.

Brody (1970) evaluated a nongraded program in which first and 
second graders had to pass a series of sequential steps in several 
subjects at 90% mastery, and were placed in groups according to 
their mastery of specific skills (regardless of grade level). Ver-
tical advancement of students was strongly emphasized. At the time
of assessment, first graders had been in this program one year and
second graders two years, but both groups of students gained signi-
ficantly more than did students matched on IQ in graded classes (ES
= +.28). Effects were particularly large in mathematics (ES =
+.52). This study was somewhat flawed by the fact that before
matching, the nongraded students were 5.4 points higher in IQ than 
their graded counterparts. 

The only matched equivalent study to find no differences in 
achievement between nongraded and graded programs was one by Otto
(1969), which took place in a laboratory school at the University of
Texas. Unlike most of the studies of comprehensive nongraded plans,
the Otto study fully described the nongraded intervention, which was
designed to be a full-scale implementation of the Goodlad and Ander-
son (1963) nongraded plan.

Unfortunately, experimental and control groups did not differ on 
many elements held to be essential to the nongraded program. Teach- 
ers of the nongraded classes did assign students to instructional 
groups across grade lines and did have students use more individual- 
ized materials and provided less whole-class instruction than did 
teachers in the graded program. However, the nongraded classes did 
not use more subgroups than graded classes and did not reduce the 
heterogeneity of subgroups. Because the experiment took place in a 
laboratory school, it may be that control classes were of high qual- 
ity and control teachers may have used many aspects of nongrading in 
their classes. It is interesting to note that another study in a
university laboratory school by Ross (1961) also found no dif-
ferences between nongraded and graded programs.

Perhaps the most comprehensive nongraded program evaluated out-
side of a university laboratory school was one studied by Bowman
(1971), in which individualized instruction, team teaching, flexible
grouping, and learning centers were used. This one-year study found
strong positive effects on the achievement of intermediate students
(ES = +.52) but not of primary students (ES = .06). Killough (1972)
also found significantly positive effects of a comprehensive non-
graded program implemented in an open-space school, although the
details of the intervention are not described. 

The only study to find higher achievement in graded than ungraded 
schools was also perhaps the lowest in methodological quality. This 
is a study by Carbone (1961) which compared the achievement of stu- 
dents in traditional graded schools to those in schools mentioned by 
Goodlad and Anderson (1959) as nongraded, controlling for IQ scores. 
The students involved were in grades four, five, and six, which is 
to say one, two, or three years (respectively) after their experi- 
ence in the nongraded primary. Further, there were substantial IQ 
differences between the two sets of students, and teacher questionn- 
aires indicated very few differences between the two sets of teach- 
ers in reported classroom practices. 

Another study, by Hopkins, Oldridge, and Williamson (1965), found 
no achievement differences between graded and nongraded classes 
controlling [...]
[...] (personal communication, February 12, 1986) indicated that the
[...] trying to implement nongrading as sug[...]



It is interesting to note that, probably because it appeared
early in the nongrading movement, the Carbone (1961) study was taken
by several reviewers of this literature as a serious counterweight
to the positive findings of other studies. For example, this was
the only negative evidence cited in a review done by the National
Education Association (1967), yet the review concluded that "no con-
clusive data favoring nongraded organization over the graded or
graded over the nongraded can be found in studies made so far, but
the preponderance of studies appears to be favorable" (page 160).



Summary and Discussion: Comprehensive Nongraded Plans. Overall,
the data from studies of comprehensive nongraded plans support the
use of this grouping plan (also see Pavan, 1973). Excluding studies
done in university laboratory schools and the seriously flawed Car-
bone (1961) study, the median effect size rises to about +.33. Two
studies (Hickey, 1963; Buffie, 1962) found that the effects of non-
graded programs were particularly positive for high achievers, and
Bowman (1971) found that older students benefited more than younger
ones. It may be that students need a certain level of maturity or 
self-organizational skills to benefit from a continuous-progress 
program which includes a good deal of independent work. 

Within-Class Ability Grouping

Research on within-class ability grouping differs from research on
between-class ability grouping in many important ways. First, this
research is considerably more likely to use random assignment than
is research on between-class ability grouping. Five randomized stu-
dies met the criteria for inclusion applied in this review (an
additional study, by Putbrese (1972), also used random assignment
but was omitted because it had only one experimental and one control
class). Second, the duration of within-class ability grouping stu-
dies is shorter, with most studies lasting about one semester.



Table 5 here 



Research on within-class ability grouping is summarized in Table 
5. Every study which met the inclusion criteria involved the use of 
math groups, although Jones (1948) also studied grouping in reading 
and spelling. The lack of studies of grouping in reading is sur- 
prising. It may be that this practice is so widespread that forma- 
tion of ungrouped control groups is difficult to arrange, even on an 
experimental basis.

Every study of within-class ability grouping in mathematics 
favored the practice, though not always significantly. The median 
effect size for the five randomized studies is +.32; including
matched studies makes the median only slightly higher. Five studies
[...] clear pattern of [...] within-class ability grouping than in
control (ungrouped) treatments.
However, it is interesting to note that the median effect size for
low achievers (ES = +.65) was higher than that for average (ES =
+.27) or high achievers (ES = +.41).



Slavin and Karweit (1985) conducted two large randomized studies
of within-class ability grouping, one in highly heterogeneous,
racially mixed schools in Wilmington, DE (Experiment 1), and one in
relatively homogeneous, predominately white schools in and around 
Hagerstown, MD (Experiment 2). 

In Experiment 1, grade 4-6 classes were randomly assigned to one
of three treatments. One, an individualized model, is not consid-
ered here. A second was a whole-class instructional model called
the Missouri Mathematics Program (MMP), which had been found in ear-
lier research (Good & Grouws, 1979) to be more effective than tradi-
tional whole-class instruction. The MMP, based on the findings of
studies of the practices of outstanding elementary mathematics
teachers (e.g., Good & Grouws, 1977), uses a regular sequence of
teaching, controlled practice, independent seatwork, and homework,
with an emphasis on a high ratio of active teaching to seatwork,
teaching mathematics in the context of meaning, and management stra-
tegies intended to increase student time on-task (Good, Grouws, &
Ebmeier, 1983). The third treatment [...] ability grouping [...]
this treatment [...]

Results of the semester-long study indicated significantly higher
achievement in ability grouped than in whole-class instruction (ES =
+.32). Effects were large for mathematics computations (ES = [...]),
but there were no differences in concepts and applications (ES =
[...]).

Experiment 2 involved the same comparisons, except that an
untreated control group was also added. The results very closely
paralleled those of Experiment 1; students in the ability grouped
classes achieved more than those in the two whole-class
methods (ES = +.27), with differences larger in mathematics computa-
tions (ES = +.37) than concepts and applications (ES = +.17).

Dewar (1964) randomly assigned sixth grade mathematics classes
and their teachers in a suburb of Kansas City, KS to use within-
class ability grouping or whole-class instruction for a full school
year. Three math groups were used in the grouped classes. Results
strongly favored the grouped classes for students who had been in
the top, middle, and low groups in comparison to their counterparts
in the control group (ES = +.55). In a similar study, Smith (1960)
randomly assigned grade 2-5 classes to grouped or control conditions
[...] within each grade level students in experimental and control
groups were matched on sex, age, achievement level, and [...]
especially for students assigned to the lowest group (ES = [...]).



The smallest positive effect of any study of within-class ability
grouping was reported in a study by [...], who ran-
domly assigned four sixth grade mathematics classes to ability
grouped or whole-class treatments for one semester and then had all
classes experience the opposite treatment for one semester. After
one semester the scores of the ability grouped students were higher
than those of the control students (ES = +.30), but second semester
results nearly wiped out this difference (ES = +.07). It is impor-
tant to note that this was the only study to use four math groups
rather than two or three.

Three nonrandomized studies generally supported the results of
the randomized ones. Spence (1958) compared the achievement gains
of students in mathematics classes using within-class ability group-
ing to that of students in control classes matched on IQ and arith-
metic achievement. Results indicated significantly greater gains
for the grouped students. Stern (1972) compared the achievement of
low achievers in classes using math groups to that of students
matched on achievement pretests. Despite matching, pretest differ-
ences favored students in the control conditions, but gain scores
clearly favored the grouped classes (ES = +.36). Note that these
low achievers were not in homogeneously low classes, but were
selected from among heterogeneous math classes using or not using
within-class ability grouping.

A study by Jones (1948) also compared matched students in different
[...]
study is not well specified, but did involve within-class grouping
of some kind for reading, spelling, and mathematics. Results
favored grouped classes for all three subjects and for three levels
of ability (overall ES = +.26). Information adequate for computa-
tion of effect sizes was only provided for composite achievement.

Summary and Discussion: Within-Class Ability Grouping. Research
on the use of math groups consistently supports this practice in the
upper elementary grades. Among research on ability grouping in gen-
eral, this research is of exceptional quality. Five well-controlled
studies used random assignment of classes to treatments and sample
sizes large enough to minimize the potential impact of teacher
effects. 

There is no evidence to suggest that achievement gains due to 
within-class ability grouping in mathematics are achieved at the 
expense of low achievers; if anything, the evidence indicates the
greatest gains for this subgroup. This finding is surprising in 
light of several studies of ability grouping in reading suggesting 
that students in low reading groups experience a lower quality of 
instruction than do those in higher groups (see, for example, Rist, 
1970; Allington, 1980; Eder, 1981). Time on task is generally lower 
in low than in high reading groups (Gambrell, Wilson, & Gantt, 1981;
Good & Beckerman, 1978; Martin & Evertson, 1980), and there is some
evidence that low reading groups receive lower-level questions
([...], 1976) and more teacher interruptions (Eder, [...]) than do
high groups. A few studies (e.g., Weinstein, 1976) find that read-
ing group membership [...]
ability is statistically controlled.

Yet comparisons of high and low reading groups are largely com-
parisons of more and less able or proficient students, not compari-
sons of different classroom organization methods. It is hardly sur-
prising that high and low achievers differ and that their teachers'
behaviors differ accordingly. Comparisons of achievement gains in 
high and low reading groups are bound to show an advantage of being 
in the high group because high achievers learn more rapidly than low 
achievers, and unless measures used to control for initial ability 
are perfectly reliable and perfectly predictive of later reading 
achievement, assignment to the high reading group will appear to 
lead to higher achievement (see Reichardt, 1979).

However, comparisons of relative achievement gains or other dif-
ferences between high and low reading groups only indirectly address
the critical question: what are the most effective instructional
arrangements for low achievers (as well as average and high achiev- 
ers)? In elementary mathematics, the evidence presented here sup- 
ports the use of within-class ability grouping for all students, 
especially low achievers. It cannot be assumed that results in 
mathematics can be applied to reading, but it is certainly the case
that only experimental comparisons of grouped and ungrouped reading
classes such as those done to study math grouping can determine the
achievement effects of within-class ability grouping in reading.

The previous discussion is based on the assumption that any
effect of within-class ability grouping is in large part due to the
reductions in heterogeneity it brings about. However, there is one
study which raises some question about this assumption. This study,
a dissertation by Eddleman (1971), compared within-class ability
grouping in mathematics to a within-class grouping plan in which
students were assigned to three heterogeneous subgroups. There was
some differentiation of instructional level for students in the
ability grouped classes, but in all other respects the teacher's
methods in the two groups were identical, with instruction given to
one group at a time while the other two groups worked problems at 
their desks. Classes were randomly assigned to treatments, with the 
same teachers teaching classes using homogeneous and heterogeneous 
subgroups. 

Results of the nine-week study slightly favored the heterogeneous
grouping plan (ES = -.16). Unfortunately, there was no ungrouped
control condition, so it is impossible to determine whether the two
forms of subgrouping were equally effective or equally ineffective;
the brevity of the study suggests the latter. However, if future
research were to establish that within-class ability grouping and
within-class heterogeneous grouping were equally effective (and more
effective than ungrouped arrangements), we would have to
reconceptualize the usual explanations for the effectiveness of
within-class ability grouping. For example, it may be that
within-class ability grouping increases achievement by reducing the
size of instructional groups (say, from thirty to ten) or by
structuring the teacher's instructional time more effectively rather
than having anything to do with reducing heterogeneity (see Slavin
and Karweit, 1984). Clearly, research directed at explaining the
achievement effects of within-class ability grouping is needed.



Discussion

Many previous reviewers of the ability grouping literature have
characterized the evidence as a muddle or a maze (e.g., Borg, 1965;
Passow, 1962). However, earlier reviewers have generally combined
elementary with secondary research, good quality with hopelessly
biased studies, research on comprehensive ability grouping plans
with that on special programs for the gifted or learning disabled,
and in some cases, research on between-class ability grouping with
that on within-class grouping.

When the scope of the review is limited to methodologically adequate
studies of comprehensive ability grouping at the elementary level and
different types of ability grouping are reviewed separately, the
results are surprisingly clear cut for most types of grouping. The
best evidence from randomized and matched equivalent studies
unequivocally supports the positive achievement effects of the use of
within-class ability grouping in mathematics and of Joplin and
nongraded plans in reading. There is no support for the practice of
assigning students to self-contained classes according to general
ability or performance level, and there are enough good quality
studies of this practice that if there were any effect, it would
surely have been detected. Evidence on the effects of regrouping
within grade levels for reading and mathematics is unclear, and there
is no methodologically adequate evidence concerning the use of reading
groups.

The conclusion of the research reviewed here for practice may be
quite simple: Use the grouping methods which have been found to be
effective (within-class ability grouping in mathematics, Joplin and
nongraded plans in reading), and avoid those which have not been
found to be effective. In particular, there is good reason to avoid
ability grouped class assignment, which seems to have the greatest
potential for negative social effects since it entirely separates
students into different streams (see Rosenbaum, 1980). However,
there is much more we must understand about how various ability
grouping plans have their effects. A theory able to encompass the
research findings is needed. The remainder of this paper explores
the findings and other evidence in an attempt to extract general
principles of grouping for instruction in the elementary school.

The primary reason educators group students according to ability
or performance level is to enable teachers to provide instruction
closely suited to students' levels of readiness. In a highly diverse
class, it is argued, one level and pace of instruction is likely to
be too easy for some students and/or too difficult for others.
Ability grouping is supposed to reduce student heterogeneity so that
an appropriate pace and level of instruction is provided for most
students.

Having instruction be carefully accommodated to students' level
of readiness is probably more important in some subjects than in
others. In general, subjects in which skills build upon one another
in a hierarchical fashion (e.g., mathematics, reading) should
require more accommodation to individual differences in learning
rate than subjects in which learning the next skill or concept is
less clearly dependent on mastery of earlier material (e.g., social
studies, science). The reason for this is that with hierarchically 
organized subjects, there is a risk that if the teacher proceeds too 
rapidly, some students will lack the prerequisite skills needed to 
learn new material, while if the teacher takes the time needed to 
ensure that all students have prerequisite skills, the more able 
students will waste a great deal of time. 


Ability grouping is one logical way out of the dilemma posed by 
having to choose one instructional pace for a diverse group in a 
hierarchical subject. Yet if an ability grouping plan is to have
the desired effect, there are at least three criteria it must meet:

1. The grouping plan must measurably reduce student heterogeneity
in the specific skill being taught;

2. The plan must be flexible enough to allow teachers to respond to
misassignments and changes in student performance level after
initial placement; and

3. Teachers must actually vary their pace and level of instruction
to correspond to students' levels of readiness and learning
rates.

As noted earlier, research on the effect of grouping on class 
heterogeneity has found that in the situation typical of elementary 
schools where students are divided into two or three "homogeneous" 
groups, the actual reduction in heterogeneity brought about may be 
quite minimal. This is particularly true when students are assigned 
to classes on the basis of IQ or of a general measure of perfor- 
mance, as imperfect correlations between these measures and actual 
performance in any particular subject leave a great deal of hetero-
geneity in the supposedly homogeneous classes (Goodlad & Anderson,
1963; Balow, 1962; Clarke, 1958; Balow & Curtin, 1966).
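
The point about imperfect correlations can be made concrete with a simulation. This is a hypothetical sketch, not data from the studies cited: the correlation `r = 0.6` between the general measure and the specific skill, the number of simulated students, and the helper `within_group_sd` are assumptions chosen for illustration. It compares the spread of skill scores within three groups formed on the general measure against three groups formed on the skill itself.

```python
import random
import statistics


def within_group_sd(scores, keys, n_groups=3):
    """Mean standard deviation of `scores` inside groups formed by
    ranking students on `keys` and cutting the ranking into equal parts."""
    order = sorted(range(len(scores)), key=lambda i: keys[i])
    size = len(order) // n_groups
    sds = []
    for g in range(n_groups):
        start = g * size
        end = (g + 1) * size if g < n_groups - 1 else len(order)
        sds.append(statistics.pstdev([scores[i] for i in order[start:end]]))
    return statistics.mean(sds)


rng = random.Random(1)
r = 0.6  # assumed correlation between the general measure and the skill
general = [rng.gauss(0.0, 1.0) for _ in range(3000)]
skill = [r * g + (1 - r ** 2) ** 0.5 * rng.gauss(0.0, 1.0) for g in general]

sd_ungrouped = statistics.pstdev(skill)           # whole, ungrouped population
sd_by_general = within_group_sd(skill, general)   # grouped on IQ-like measure
sd_by_skill = within_group_sd(skill, skill)       # grouped on the skill itself

print(f"ungrouped: {sd_ungrouped:.2f}")
print(f"grouped by general measure: {sd_by_general:.2f}")
print(f"grouped by specific skill: {sd_by_skill:.2f}")
```

In this model, grouping on the general measure removes only a modest share of the spread in the specific skill, while grouping on the skill itself removes much more; the gap narrows as the assumed `r` approaches 1.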

Thus, ability grouped class assignment generally fails to meet
the first of the three criteria listed above; a one-time assignment
by general ability is unlikely to create enough homogeneity on any
particular skill to make an instructional difference.

Ability grouped class assignment is also unlikely to fulfill the
second criterion, flexibility. Transferring students between
self-contained classes is difficult to arrange, so students who are
misassigned or whose achievement level markedly changes over time
are likely to remain in the self-contained class. In contrast,
regrouping and Joplin and nongraded plans group students based on
their performance in a specific skill, and are inherently more
flexible than ability grouped class assignment, as changing students
between regrouped classes only involves one subject, not a change in
students' main class identification. Similarly, within-class ability
grouping is done based on performance in a particular skill, and
is the easiest grouping plan to alter based on changes in student
performance.

To what extent do teachers adapt their level and pace of instruc- 
tion to the needs of different ability groups? Research comparing 
alternative grouping arrangements has not examined this question in 
any depth, but there are some clues. Studies by Barr and Dreeben
(1983) found that teachers do adapt their instructional pace to 
accommodate the aptitudes of reading groups, but they also found 
considerable variation from school to school and teacher to teacher 
in pacing for groups of similar aptitudes. 

Some indirect evidence suggests the importance of adapting 
instruction to student differences. One form of grouping often seen 
in mathematics instruction involves assigning students to three 
ability groups within the class. The teacher presents one lesson to
the class as a whole, and while students are doing seatwork, visits
with the low group [illegible], the high group to provide enrichment,
and the middle group to provide some of each. Note that this strategy
does not adapt the pace or level of instruction to student needs. Two
studies at the [illegible] level (Bierden, 1969; Mortlock, 1970) and
one in a community college
(Merritt, 1973) found no significant differences between this type 
of within-class ability grouping and traditional whole-class 
instruction, much in contrast to the studies of within-class ability 
grouping plans in which level and pace of instruction were adapted 
to student performance levels. 

One critical feature of the successful Joplin/nongraded plans is
frequent, careful assessment of student performance levels and
provision of materials appropriate to these levels regardless of
students' grade levels. In these plans, adaptation of instructional
pace and level to student needs is as great as it could possibly be
short of individualization; in one study of the Joplin plan
(Moorhouse, 1964) it was noted that "three-quarters of the (grade 4-6
experimental) students were reading material either above or below
the grade level they would usually be asked to attempt in the graded
system" (pp. 281-282).

In contrast, a study of regrouping for reading by Moses (1966)
instructed experimental teachers to use only materials appropriate
to students' grade levels and to follow the school district's usual
course of study. This study found no significant advantages of
ability grouping (ES = +.05). Similarly, Davis and Tracy (1963), in
a study of [illegible] teachers to the same grade-level textbooks. In
this study, control students gained more in achievement than did
students in the within-grade regrouping plan.

An otherwise similar study of within-grade regrouping for
mathematics by Provus (1960) did allow for differentiation of level
and pace of instruction, noting that "... it was possible for a fourth
grader ... to advance to sixth grade or even eighth grade work by the
end of the school year." Control (ungrouped) students were also
able to go beyond their designated grade level, but presumably did
not do so as often as did experimental students. This study found
strong positive effects of regrouping on mathematics achievement
(ES = +.39).

Of course, Joplin and nongraded plans can be seen as forms of 
regrouping for reading and/or mathematics which go to great lengths 
to adapt the level and pace of instruction to that of the regrouped 
classes. In fact, it could be argued that it is not the cross-grade
aspect of Joplin/nongraded plans that accounts for their effects, 
but rather the fact that students in these plans are carefully 
assessed and given instruction appropriate to their needs in the 
regrouped classes. 

Taken together, the evidence points to a conclusion that for
ability grouping to be effective at the elementary level, it must
create true homogeneity on the specific skill being taught and
instruction must be closely tailored to students' levels of
performance.



[illegible] Identification

Critics of ability grouping (e.g., Oakes, 1985; Schafer & Olexa,
1971; Rosenbaum, 1980) have often noted the detrimental
psychological effect of being placed in a low achieving class or
track. An interview with a former delinquent about his discovery that
he had been assigned to the "basic track" in junior high school
illustrates this theme (from Schafer & Olexa, 1971, pp. 62-63):

"... I felt good when I was with my (elementary) class, but
when they went and separated us that changed us. That
changed our ideas, our thinking, the way we thought about
each other, and turned us to enemies toward each other --
because they said I was dumb and they were smart.

When you first go to junior high school you do feel
something inside -- it's like a ego. You have been from
elementary to junior high, you feel great inside. You get this
shirt that says Brown Junior High ... and you are proud of that
shirt. But then you go up there and the teacher says --
"Well, so and so, you're in the basic section, you can't go
with the other kids." The devil with the whole thing -- you
lose -- something in you -- like it goes out of you."



The anguish expressed by the student who was assigned to the
basic track is interesting in light of the high probability that the
student had been in low reading groups in elementary school, but he
still perceived being "separated" into different classes as a
completely new and much more serious affront to his self-esteem.
Within-class grouping generally takes place within the context of a
more or less heterogeneous class, and a student still identifies with
the [illegible] expectations than [illegible] performing classes would
seem likely to come to see themselves as a different species of human
being, where those assigned to low reading or math groups in
heterogeneous classes may see this placement as being done to help
them, particularly if assignment to groups is flexible and is clearly
focused on achievement in a particular subject.

Teachers' expectations and behaviors may also be different in
different types of ability grouping. Not surprisingly, teachers
prefer to teach higher-achieving students (NEA, 1966) and have
higher expectations for their achievement. These expectations can
have an impact on teachers' behaviors and students' achievement (see
Good and Brophy, 1984). For example, in a study of Air Force
training, Schrank (1969) had students randomly assigned to classes, but
told instructors that the classes were grouped by ability. Classes 
which had been (falsely) identified as high achieving in fact 
achieved more than did classes identified as low achieving. 

The problems of teachers' low expectations for students in
low-track classes and their dislike for being assigned to these low
achieving classes are largely alleviated in Joplin and nongraded
plans, in which reading classes are formed across age lines. This
[illegible] older students, so homogeneity for instruction may be
achieved without establishing classes that teachers do not want to
[illegible] teachers expect little. Also, as noted by one of the
authors of the Hillson et al. (1964) study of a nongraded plan (J. W.
Moore, personal communication, January 23, 1986), low achieving
students in nongraded plans progress from reading level to reading
level rather than remaining year after year in the low reading group.

Teachers may have low expectations for students in low reading or
math groups, but there is some evidence that they try to bring low
groups up to the level of the rest of the class. For example, Rowan
& Miracle (1983) found that in between-class ability grouping
teachers tended to maintain a slow pace of instruction for low
achieving classes, but tended to allocate more time and a more rapid
pace of instruction for low reading groups in heterogeneous classes.
This and other research (e.g., Alpert, 1974) suggests that in
within-class ability grouping, teachers tend to try to equalize the
achievement of all students by assigning smaller numbers of students
to low groups.

Another issue relating to students' identification with their
class is the question of how many times students are regrouped each
day. When students are regrouped for reading and/or mathematics,
they still typically spend the rest of their school day in
heterogeneous homeroom classes, which probably remain as their primary
[illegible] the situation comes to resemble departmentalization,
[illegible] move from teacher to teacher and have no one group with
which to identify. Unfortunately, there is no research on [illegible]
outcomes of departmentalization at the elementary level, although one
study of seventh and eighth graders by Spivak (1956) found that
students in self-contained classes learned more than matched students
in departmentalized settings.

Departmentalization might reduce students' attachment to school
by diffusing their attachments to particular teachers. Indirect
evidence of this is a finding by Slavin and Karweit (1982) that
student truancy in an urban school district rose from about
[illegible]% in the fifth and sixth grades to 26% in the seventh
grade, the time of first exposure to a departmentalized (and tracked)
school in which no one teacher takes responsibility for any one
student. It may be that the Koontz (1961) study, in which students
were separately regrouped for reading, mathematics, and spelling,
deprived students of an opportunity to identify with a single teacher
and a heterogeneous class. In this study, heterogeneous control
students gained more in achievement than did those who were
regrouped, with low achievers suffering most from regrouping.

The evidence on the importance of having students principally
identify with a heterogeneous class is more speculative than
conclusive, but several indirect indications support the following
conclusion: [illegible] homeroom with which [illegible] plan must have
clear educational benefits if it is to be justified. Because no
achievement benefits of ability grouped class assignment have been
identified, and because more effective grouping methods exist, use of
this strategy should be avoided.

Instructional Time



One issue of considerable importance in relation to within-class
ability grouping relates to a tradeoff between providing students
with instruction appropriate to their needs on one hand, and provid-
ing adequate instructional time on the other (see Slavin, 1984a).
When a teacher uses a within-class ability grouping plan with three
groups, this means that students must spend at least two-thirds of
their instructional time working without direct teacher instruction 
or supervision. Several studies have found that large amounts of 
unsupervised seatwork are detrimental to student achievement (see 
Brophy and Good, 1986). Transition times between ability groups 
further reduce instructional time (Arlin, 1979). 
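
The arithmetic behind the two-thirds figure generalizes to any number of groups. The following sketch assumes, purely for illustration, that the teacher divides a period equally among groups and works with one group at a time; it ignores transition time, and the function name `unsupervised_share` is hypothetical.

```python
from fractions import Fraction


def unsupervised_share(n_groups):
    """Share of a period each student spends without direct teacher
    instruction, assuming equal time per group and one group taught
    at a time (an illustrative simplification)."""
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    return Fraction(n_groups - 1, n_groups)


for k in (1, 2, 3, 4):
    print(f"{k} group(s): {unsupervised_share(k)} of the period without the teacher")
```

Under this simplification, each added group raises the share of seatwork time, and any transition time between groups would raise it further.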

The amount of instructional time lost due to use of within-class 
ability grouping depends directly on the number of groups in use. 
Division of students into large numbers of ability groups forces the 
teacher to spend less time with each group and to assign large
[illegible], which may [illegible]. [illegible] effect size favoring
within-class ability grouping in mathematics (ES = +.07) is also the
only one to use four (rather than two or three) ability groups.

The evidence summarized in Table 5 clearly indicates that regardless
of any losses in instructional time associated with within-class
grouping, this strategy is instructionally effective in elementary
mathematics. Mathematics instruction does require a certain amount of
time for students to work problems on their own, so followup time
(the time during which some students must work by themselves while
others are working with the teacher) may be less of a problem in
mathematics than in other subjects. However, it still seems apparent
that the requirement for large amounts of followup time is a drawback
in any within-class grouping arrangement.

The problem of followup time may be important in explaining the
effectiveness of Joplin plans for reading. The studies of this
program do not typically compare the numbers of ability groups used
in experimental and control groups, but it is clear that there are
smaller numbers of reading groups in Joplin than in traditional
classes, and that in some cases Joplin Plan classes do not use
reading groups at all. In fact, some authors (e.g., Newport, 1967)
clearly describe studies of the Joplin Plan as comparisons of inter-
vs. intra-class ability grouping, since control groups in
[illegible] create such homogeneous classes that the need [illegible]
grouping is diminished or eliminated. The time savings of such plans
are [illegible] in part because students in them receive a greater
amount of direct instruction from the teacher and supervision during
seatwork than do students in control classes using the more typical
three or more reading groups.

This line of reasoning may justify use of smaller numbers of
ability groups in heterogeneous reading classes. Unfortunately,
there is no direct experimental evidence on the optimum number of
reading or math groups; the number three is treated as though it
were handed down from Mount Sinai. In heterogeneous classes it may
be that small numbers of ability groups do not provide adequate
homogeneity for effective instruction. However, it should be noted
that two studies involving only two ability groups in heterogeneous
mathematics classes found significant benefits of ability grouping
for student achievement (Slavin and Karweit, 1985).

This paper reviewed the best evidence concerning the achievement
effects of comprehensive ability grouping plans in elementary
schools. Five principal grouping plans were examined: ability
grouped class assignment, regrouping for reading and/or mathematics,
Joplin and nongraded plans, comprehensive nongraded plans, and
within-class ability grouping. The effects of these grouping methods
on student achievement from methodologically adequate studies are
summarized below:



Ability Grouped Class Assignment. Evidence from fifteen comparisons
in twelve matched equivalent and one randomized study clearly
indicates that assigning students to self-contained classes according
to general achievement or ability does not enhance student
achievement in the elementary school (median ES = .00).



Regrouping for Reading and Mathematics. Research is unclear on
the achievement outcomes of grouping plans in which students remain
in heterogeneous classes most of the day and are regrouped by
ability within grade levels for reading and/or mathematics. There is
some evidence that such plans can be instructionally effective if
the level and pace of instruction is adapted to the achievement
level of the regrouped class and if students are not regrouped for
more than one or two different subjects.

Joplin Plan. There is good evidence that regrouping students for
reading across grade lines increases reading achievement. The Joplin
Plan (Floyd, 1954) and essentially similar forms of nongraded plans
[illegible] positive effects on reading achievement (median ES =
+.44), and one study (Hart, 1962) found that a similar program could
also be effective in mathematics.

Comprehensive Nongraded Plans. Evidence from studies of nongraded
plans closer to those suggested by Goodlad and Anderson (1963) has
been less consistent than for Joplin-like nongraded plans, but the
preponderance of the evidence is still positive (median ES = +.29).
In particular, the best evidence from well-controlled studies in
regular schools supports the use of comprehensive nongraded plans.

Within-Class Ability Grouping. Research on within-class ability
grouping is unfortunately limited to mathematics in upper elementary
school. However, this research clearly supports the use of
within-class grouping (approximate median ES = +.34), especially if
the number of groups is kept small. Achievement effects of
within-class ability grouping are slightly larger for low than for
high or average achievers.

In addition to conclusions about the effects of particular
grouping strategies, several general principles of ability grouping
were proposed on the basis of the experimental evidence. The
following are advanced as elements of effective ability grouping
plans:

1. Students should remain in heterogeneous classes at most
times, and be regrouped by ability only in subjects (e.g.,
reading, mathematics) in which reducing heterogeneity is
particularly important. Students' primary identification
should be with a heterogeneous class.

2. Grouping plans must reduce student heterogeneity in the
specific skill being taught (e.g., reading, mathematics).

3. Grouping plans must frequently reassess student placements
and must be flexible enough to allow for easy reassignments
after initial placement.

4. Teachers must actually vary their level and pace of
instruction to correspond to students' levels of readiness
and learning rates in regrouped classes.

5. In within-class ability grouping, numbers of groups should
be kept small to allow for adequate direct instruction from
the teacher for each group.

Future Directions

One great danger in reviewing any voluminous literature is that 
the review will discourage further work in the area, as researchers 
question the value of one more study. We hope the present review 
will have the effect of stimulating rather than inhibiting
additional research on ability grouping in the elementary grades.

There are many fundamental questions yet to be explored. For
example, the research on within-class ability grouping is
restricted to mathematics. Experimental studies of the use of
reading groups are needed, as are studies of optimum numbers of
within-class groups for reading and math.

Many studies are needed to understand why and under what
conditions various grouping plans produce achievement effects.
Simple-appearing changes in grouping are likely to have complex
effects, any of which may contribute to ultimate effects on student
achievement. For example, different between-class grouping plans
(e.g., ability grouped assignment, Joplin Plan) are likely to have
different effects on within-class grouping.

Studies are needed to understand the effects of various grouping
plans on what actually happens in the classroom; for example, how
different plans affect the teacher's pace of instruction and use of
class time and the success rate of students close to and far away
from the class' mean aptitude. Component analyses are needed to
explore the critical features of various grouping plans. For
example, many fundamentally different practices go under the title
"nongrading." Which of these account for the positive effects seen in
the studies of this practice? Is it simply use of flexible,
homogeneous grouping across grade lines (as in the Joplin Plan), or
are other factors involved?

There are two particularly important reasons for further
investigation of grouping practices in elementary schools. First,
every school district, school administrator, and teacher makes
decisions about ability grouping at some time, and [illegible] made in
light of reliable evidence. Ironically, the grouping practice with
the [illegible] assignment, is among the most widely used; schools
need effective alternatives to this practice.

Second, if educational researchers can identify grouping prac- 
tices which can accelerate student achievement, this would provide 
one kind of school reform that would be low in cost, easy to imple-
ment, and easy to maintain over time. In a time of increasing
demands on education coupled with dwindling resources, research on 
easily modified school organizational practices seems particularly 
likely to bear fruit. We have much yet to learn in this area, but 
this review illustrates that the potential of effective grouping 
practices for meaningful improvements in the achievement of elemen-
tary students is great, and is certainly worthy of further study.

Article Grades

Randomised Studies 

Cartwright 1-3
& McIntosh,
1972

Sample
Size

Honolulu, 262
Low Income (9 cl.)



Grouping 
Duration Criteria 



Design

EFFECT SIZES
Achievement Subject

2 yrs.

IQ,
Ach.

Random asgmt. to 3
trtmts: Het, XG,
"Flexible." XG
classes grouped
across grade
lines

XG vs Het:
Rdg -.17
Math -.52

Flex vs Het:
Rdg -.28
Math .00 



Total 



-.34 



-.14 





Matched Studies with Evidence of Initial Equality 
2-5 



Barker- 
Lunn, 
1970 

Goldberg, 
Passow, &
Justman, 
1966 



Borg, 
1965 



5-6 



4-7. 
6 



Hartill, 5-6 
1936 



Barthelmess 4-5 

& Boyer, 

1932 



England, 
Wales 



5500 
(72 sch.) 



4 yrs. IQ,
Ach.

New York 2219
City, Mid- (86 cl.,
dle Class 45 sch.)

2 yrs.



Utah 



New York 1374 
City (15 sch.) 



Philadelphia 1130 

(10 sch.) 



1 yr. 



IQ 



4-667 (22 cl.) 4 yrs. Gen.
6-875 (28 cl.) 1 yr. Ach.

1 sem. Gen.
(see Ach. 
design) 



IQ 



Matched schools 
in social class 



Compared class-
es w/ specified
IQ ranges. 
Students kept 
in same classes 
for 2 yrs. 

Compared 2 dist- 
ricts, Het vs AG 



Hi 0 
Av 0 
Lo 0 

V.Hi 0 

Hi (-) 

Hi Av (-) 

Av (-) 

Lo (-) 



Gr. 4:
Hi (+)
Av 0
Lo (-)

Gr. 6:
Hi +
Av (+)
Lo 0



Rdg/Eng 0 
Math 0 



Rdg (-) 
Math (-) 



Rdg 0 
Math 0 



Rdg (+) 
Math (+)



(-) 



(+) 



Matched groups - Hi -.12 Rdg +.05 .00

each group Het 1 Av .00 Math +.01

sem, AG 1 sem. Lo +.18
Scores are gains

Schools matched, Hi +.18 Composite +.21

then students Av +.22 Ach.

matched. Scores Lo +.15
are gains.

[illegible]
& Bergman,
1936

600

Daniels,
1961

2-5

England

521
(4 sch.)

Bremer,
1956

1

Amarillo,
TX (Anglo)

510

Loomer,
1962

4-6

Iowa
(Rural)

490
(23 cl.)

Flair,
1964

1

Skokie, IL
(Suburban)

441
(17 cl.)

Morgen-
stern,

4-6

Uniondale, NY
(Suburban)

119
(7 cl.)


3.5 yrs. Gen. Schools matched,
Ach. then students
matched on [illegible]

1 yr. Rdg. Compared AG to
Readiness yr. before AG
introduced

1 yr. IQ, Schools matched
Gen. Ach., on gen. ach.
Judgment

1 yr. Rdg. Schools match-
tchr. ed on gen. ach.
prognosis

3 yrs. - Schools matched
on IQ



1963 



Hi -.24 
Av .00 
Lo **06 



Hi -.02 
Av +.04
Lo -.06 



V.Hi +.54
Hi -.21 
Av -.11 

Hi -.22 
Av +.15 
Lo +.64 



, Rdg/LA r.25 ^.26 
Math >.2? 



Rdg Ach -.10 






Composite 
Ach. 



Rdg +.03
Math -.14 



-.04 



-.06 



Rdg/LA +.17 +.15 
Math +.06 



KEY: AG = Ability grouped class assignment
Het = Heterogeneous class assignment
XG = Ability grouping across grade lines

+ = Results clearly favor ability grouped classes
(+) = Results generally favor ability grouped classes
0 = No trend in results
(-) = Results generally favor heterogeneous classes
- = Results clearly favor heterogeneous classes






Koontz,
1961 



Norfolk, 

VA 



108 
(10 cl.) 



Rdg - Rdg Ach
Math - Math Ach
Lang - Lang Ach



Matched Studies Lacking Evidence of Initial Equality 



Berkun, Swanson, & Sawyer, 1966
  Grades: 3-5
  Location: Monterey, CA
  Sample Size: 1098 (10 sch., 45 cl.)

Davis & Tracy, 1963
  Grades: 4-6
  Location: North Carolina
  Sample Size: [...]93 (2 sch.)

Balow & Ruddell, 1963
  Location: Southern Calif. (Suburban)
  Sample Size: 197 (8 cl.)



1 yr 



Rdg Rdg Ach 1 yr 



Math Math Ach 1 yr 



Rdg - Rdg Ach 1 yr 
Math - Math Ach 



Students matched
on gen. ach.
Scores are grade
equivalents.



-V.Hi -.12 GE 

Hi -.25 -GE 4;Rdg 4 -.42' GE 

Lo -.41 GE Lang ;r ,12 GE 



Design: Compared schools using AG or Het rdg classes. No evidence of
  initial equality - posttests adj. for pre.
Effect sizes by achievement: Hi +.44; Lo +.32
Effect sizes by subject: Reading

Design: Compared schools using AG or Het math classes. Pretest diffs
  favored AG. Post adj. for pre, IQ, other vars.
Effect sizes by subject: Math

Design: Compared AG to het schools. Pretest diffs favored AG.
  Scores are gains.
Effect sizes by achievement: Hi 0; Av +; Lo +
Effect sizes by subject: Rdg [...]; Math [...]



t-32"P' 



(-) 



(+)






Morel and, WjcsC* 




' ~" diffg favored 
Scores are grade 




KEY: AG = Ability group used for selected subjects

+ = Results clearly favor ability grouped classes
(+) = Results generally favor ability grouped classes
0 = No trend in results
(-) = Results generally favor heterogeneous classes
- = Results clearly favor heterogeneous classes
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lid 




Table 3
Joplin Plan

Article    Grades    Location    Sample Size    Subject    Duration

■vHaaian^'.:: 



EFFECT SIZES: by Achievement, by Subject, Total



Randomized Studies






Morgan >..^\; ;V 5«6 ' Oiindea» 
Stuckar, "■HX.»: ; 



<8 Ci.) 



/■'.', .v^irandod^^aae 

JopUn or AO. Auth- ^ 
ora not* liaitad op- ^-".f 
^portuniiyj-for/-hirtch^^" Si'/^V 
^6th gradara to work^v--^ 
fa& 3^|rid^' f; 



+.30



Jones, Moore,
& Van Devender,
1964, Jones,

:jr^(^ai;X«V^i?r-; 
pay ehder. 1^67 



Shaaokiii; : V 52 



V" 3 yra. ^ randomly aaaignad v;*^>. • • v - • o ; • ' ■ v 4 >;t^v 



(folicwup) to NG/Joplin or het.^V ^ 
" Ic'V'v ^cilaaaa^lor^ri 



Matched Studies with Evidence of Initial Equality
Russell,



4^5 San Fran- : ■•: '526 
v, ciac^^^:^(6:*chJ 





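Green & Riley, 1963
  Grades: 4-6
  Location: Atlanta (whites)
  Sample Size: [...] sch.
  Duration: [...] yr.

Ingram, 1960
  Grades: 1-3
  Location: Flint, MI
  Sample Size: 68 exp, 377 cont
  Subject: Rdg
  Duration: 3 yrs.

Halliwell, 1963
  Grades: 1-3
  Location: New Jersey
  Sample Size: 295
  Subject: Rdg
  Duration: 1 yr.

Carson & Thompson, 1964
  Grades: 4-6
  Location: Sebastopol, CA
  Sample Size: 250
  Duration: 1 yr.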





2 yra tMatchad^etudafita in\ ! 

- : ' : ^v >rt - ^ i Jppl:in, 7 i het.^ 

. on IqC^ Study pr^ 
-\ uce of law poplin 
Plani^-^TWoi reading 
■ / : iroupa uaaH withiir; v ^ 



i!t raidijng" only 'to 
pr«jyioiia>yr thftv) # 

Ccaipaced NG/J6plin 
in ir4g & laii^ ^ 
p riwio.uaviyr Jhat » ) . 
Es^^:ai|<^a - 



' Vftda '-a- ^* : .00 * A ^ ^ - : ^^r^^^Sf^f " 

" t - SfKft, *.■•*.. ^.\ >l . -V-,. ' . .-■ ./ ; ''\.Vf-"si. •"•'.4,- :■-■».•>: .^i^V^ Vr^ 



Mg 



Rdg 



+.36 



+.55 



in «ath» 
in program • 



waa not 



Compared Joplin to
AG class assignment,
matching on IQ. Rdg.
grps. [...] within
Joplin classes.



Rdg tr53 (gr 1-3) *.59 
S^it 4.58 <gr 7 2-3[) 
Lang 4.26 (gr 3) 



Rdg 



in 
TABLE 4

Article    Grades    Location    Sample Size    Duration    Design

Matched Studies With Evidence of Initial Equality

Hickey, 19[...]
  Grades: 1-3
  Location: Pittsburgh (Catholic [...])
  Sample Size: [...]348 (14 sch.)
  Duration: 3 yrs.
  Design: Compared NG and G schools matched on SES. IQ's identical.
    NG program not described.

Brody, 1970
  Grades: 1-2
  Location: Pennsylvania
  Sample Size: 268 (3 sch.)
  Duration: Gr. 1 - 1 yr.; Gr. 2 - 2 yrs.
  Design: Students in NG and G schools matched on IQ.

Buffie, 1962
  Grades: 1-3
  Sample Size: 234 (8 sch.)
  Duration: 3 yrs.
  Design: Schools matched on SES, then students matched on sex,
    age, IQ.

Otto, 1969
  Grades: 3-5
  Location: Austin, TX (Upper middle class, lab school)
  Sample Size: 15 cl.
  Duration: 2 yrs.
  Design: Compared NG and G classes within same university lab
    school. Students matched on gen. ach.

Remacle, 1971
  Grades: 5-6
  Location: Brookings, SD
  Sample Size: 128
  Duration: Gr. 5 - 2 yrs.; Gr. 6 - 1 yr.
  Design: Compared NG and G schools. Students matched on IQ.
    Scores are grade equivalents.

Machiele, 1965
  Location: Urbana, IL
  Sample Size: 100
  Design: Compared NG to year before NG introduced. Students
    matched on IQ, age.



Table 4, continued

Matched Studies Lacking Evidence of Initial Equality

Bowman, 1971
  Grades: 1-6
  Location: Burlington, NC
  Sample Size: 457
  Duration: 1 yr.
  Design: Compared students in NG, G schools, controlling for IQ.
    NG started higher in IQ. NG classes used indiv. inst., team
    teaching, learning centers.
  Effect sizes by achievement: Gr. 1-3 +.06; Gr. 4-6 +.52
  Effect sizes by subject: Rdg +.29; Math .28
  Total: +.29

Hopkins, Oldridge, & Williamson, 1965
  Grades: 3-4
  Location: Los Angeles Co., CA (Suburban)
  Sample Size: 330 (45 cl.)
  Duration: 3 yrs.
  Design: Compared students in NG, G schools, controlling for IQ.
    NG started higher in IQ.
  Effect sizes by subject: Reading
  Total: +.02

Ross, 1967
  Grades: 1-3
  Location: Bloomington, IN (Lab school)
  Sample Size: 314
  Duration: 1 yr.
  Design: Compared students in NG, G classes in same university lab
    school, controlling for pretest and IQ. Pretest and IQ means
    not given. Scores are grade equivalents adjusted for pretests
    and IQ.
  Effect sizes by subject: Rdg +.06GE; Math +.06GE
  Total: +.06GE

Killough, 1972
  Grades: 1-8
  Location: Houston, TX
  Sample Size: 300 (4 sch.)
  Duration: 3 yrs.
  Design: Compared students in 1 NG, 3 G schools. No evidence of
    initial equality. ANCOVA controlled for IQ.
  Effect sizes by subject: Rdg +; Math +

Carbone, 1961
  Grades: see design
  Sample Size: 244 (6 sch.)
  Duration: see design
  Design: Compared students in NG, G schools controlling for IQ.
    Students were in grades 4-6, NG had been used in grades 1-3.
    NG started substantially higher in IQ.
  Effect sizes by subject: Rdg -; LA -; Math -

KEY: NG = Nongraded Program
G = Graded Program (Control)
+ = Results clearly favor nongraded classes
(+) = Results generally favor nongraded classes
0 = No trend in results
(-) = Results generally favor graded classes
- = Results clearly favor graded classes
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Table 9 
Within-Class Ability Grouping

Article    Grades    Sample Size    No. of Groups    Duration
EFFECT SIZES: by Achievement, by Subject, Total





Slay in & 

ment 1) 

Dewar. 
1964 



A-6 



Wilaiti^ton, 231 



Johnson 
Co.. KS 
(middle 
class) 



199 
(8 cl) 



5 . . riJjC^ttifti randomly : Hi. +.13 ,'. . . Math ; ; . '+.32 . 1 "ClLalz*: 




8 classes randomly      Hi +.55
assigned to WCAG,       Av +.43
control                 Lo +.67



Math 



+.55 






Smith, 1960
  Grades: 2-5
  Location: Lake Charles, LA
  Sample Size: 180 (8 cl)
  Design: Classes randomly assigned to WCAG, control, then students
    matched on sex, age, IQ, math ach.
  Effect sizes by achievement: Hi +.28; Av +.25; Lo +.69
  Effect sizes by subject: Math
  Total: +.41

Wallen & Vowles, 1960
  Location: Salt Lake City, UT
  Sample Size: 112 (4 cl)
  Design: 4 classes randomly assigned to WCAG, control for 1 sem.
    Then classes switched treatments for second sem.
  Effect sizes by subject: Math
  Total: +.07







Matched Studies with Evidence of Initial Equality






Jones, 



Richmond,
IN 



250 
(31 cl) 



8 



Matched Studies Lacking Evidence of Initial Equality



Stern, 
1972 



3.-4 V Cbvij.ta» '■ 
(low achievers)

i 1 
i\ 

a; 



.'2*17 *l'#>xp — 
91 irjnt 
(o7 cl) 



KEY: WCAG = within-class ability grouping



t ;;SC^r^^a^kg«i]n^ ■ . ; ,. v , ■ % ,.-:- 1 ---^-,-.V^V,v.. >^ -t: a 

Studenta in UCAG» Hi>^8fGE ^2$ ^ : v£U ' 
control patched Av **27 ^0B4Sp*il -t»A3 



on-IQ-^ Sdores 



Compared WCAG,
control matched on
general ach. level,
low achievers only.
Control began higher
at pretests.



Math 



•'. ..V v ....:.--.\- , i.'A^i^|^, S 



+.36






correlations are known, effect sizes from gain scores can be trans-
formed to the scale of posttest values using the following multiplier:

$$ES = (ES_{gain}) \sqrt{2(1 - r)}$$

Because [...] a pre-post correlation of +0.8 was assumed. This figure
is a characteristic correlation between fall and spring scores on
alternate forms of the California Achievement Test in the upper
elementary grades.
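
The gain-score conversion can be sketched in a few lines of Python. This is a minimal illustration rather than the author's own procedure: it assumes the multiplier is sqrt(2(1 - r)), which holds when pretest and posttest standard deviations are equal, and the function name `gain_to_posttest_es` is hypothetical.

```python
import math


def gain_to_posttest_es(es_gain: float, r: float = 0.8) -> float:
    """Rescale a gain-score effect size to posttest standard deviation units.

    Assumes equal pretest and posttest variances, so that
    SD(gain) = SD(posttest) * sqrt(2 * (1 - r)), where r is the
    pre-post correlation (default +0.8, the value assumed in the text).
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be a correlation between -1 and 1")
    return es_gain * math.sqrt(2.0 * (1.0 - r))


# Multiplier implied by the assumed pre-post correlation of +0.8.
multiplier = gain_to_posttest_es(1.0)
print(round(multiplier, 3))
```

With r = +0.8 the multiplier is about 0.632, so a gain-score effect size of +1.00 corresponds to roughly +0.63 in posttest standard deviation units.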



[...] effect size estimates [...]

[...] scores were from different tests (e.g., [...], 1964),
experimental-control difference/control standard deviations were
computed for pre- and posttests, and the difference between these is
reported as the study's effect size. Since all studies which met
inclusion criteria presented either gain scores or pre- and post-test
scores or matched on pretests, all effect sizes were adjusted for
initial starting points.
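
When pretests and posttests come from different instruments, the adjustment described above amounts to subtracting the standardized pretest difference from the standardized posttest difference. A minimal sketch, assuming each standardized difference is the experimental-control mean difference divided by the control-group standard deviation; the function names and the example numbers are hypothetical and are not data from any study in the review.

```python
def standardized_difference(exp_mean: float, ctl_mean: float, ctl_sd: float) -> float:
    """Experimental-control mean difference in control-group SD units."""
    if ctl_sd <= 0:
        raise ValueError("control standard deviation must be positive")
    return (exp_mean - ctl_mean) / ctl_sd


def adjusted_effect_size(pre: tuple, post: tuple) -> float:
    """Posttest standardized difference minus pretest standardized difference.

    Each argument is (experimental mean, control mean, control SD).
    """
    return standardized_difference(*post) - standardized_difference(*pre)


# Hypothetical example values (assumed for illustration only).
pretest = (50.0, 49.0, 10.0)   # standardized difference = +0.10
posttest = (56.0, 52.0, 8.0)   # standardized difference = +0.50
effect = adjusted_effect_size(pretest, posttest)
print(round(effect, 2))  # prints 0.4
```

Subtracting the pretest difference is what adjusts the effect size for initial starting points: a group that began 0.10 standard deviations ahead and finished 0.50 ahead is credited with +0.40.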




If studies did not present enough data to allow for computation
of effect size but otherwise met criteria for inclusion, they were
included in tables with an indication of the direction and consis-
tency of any achievement differences. In some cases only grade
equivalent differences were given, and these are presented in the
table. Because the standard deviations of grade equivalents are
around 1.0 in elementary school, grade equivalent differences [...]
considered [...]

In general, one overall effect size is presented for each study,
unless two or more different ability grouping plans were compared to
heterogeneous control groups in the same study (e.g., Cartwright and
McIntosh, [...]) or two distinct samples were studied (e.g., [...],
1965). Multiple effects within a study were averaged to obtain the
[...]

In this best-evidence synthesis, every effort was made to make
each effect size be a [...] of ability grouping on student posttest
achievement, holding the posttest standard deviation as the common
metric. In all tables, randomized studies are listed first, followed
by matched studies pre-
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